The study of the molecular biology of immunoglobulin (Ig) genes 
represents one of the first triumphs of recombinant DNA technology. 
Before the advent of gene cloning, Ig genes could be studied only 
indirectly by inferences from amino acid sequences. Many perplex- 
ing questions were resolved when it became possible to examine 
directly the genes themselves. Recently the cloned genes have moved 
beyond the pure research laboratory to be used as tools for various 
applied engineering projects. This chapter summarizes some of these 
exciting advances in both the basic and applied arenas. 

The unique mystery of antibody genes lies in the diversity of 
proteins they encode. This diversity exists at several levels. 

Most striking is the diversity of antigen-combining sites of these 
molecules. The classic studies of Landsteiner suggested that the 
repertoire of binding specificities of antibodies is essentially unlim- 
ited. The diversity of binding specificities is explained by the diver- 
sity of amino acid sequences found in the N-terminal domain of 
both light (L) and heavy (H) chains — the variable (V) region — each 
containing three regions of especially high variability (hypervari- 
able regions) which correspond to the loops of the protein that con- 
tact antigen, or complementarity determining regions (CDRs), as 
discussed in Chapter 3. Yet on the C-terminal end, the single domain 
of the L chain and the three (or four depending on isotype) domains 
of H chains were found to be invariant within each class of L or H 
chains; these segments are designated constant (C) regions. Many 
models were proposed to explain the unprecedented diversity found 
in Ig V regions. One extreme model suggested that the immense 
diversity of V regions was directly encoded in the germline genome, 
presumably a result of gene duplication and mutation acting over 
evolutionary time. At the other extreme, the somatic mutation 
model supposed that very few V-region sequences were encoded in 
the genome and that a special somatic mutation mechanism oper- 
ated on these sequences to increase diversity within the life span of 
the organism. Regardless of whether sequence diversification 
occurred in phylogeny (germline diversity) or ontogeny (somatic 
mutation), another question remained: How did the C regions of Ig 
genes escape such changes? In 1965, Dreyer and Bennett (1) pro- 
posed that for each class of Ig genes there might be only a single 
C-region gene, which was encoded in the germline separately from 
the multiple V-region genes; in the development of an antibody- 
producing cell, one of the V-region sequences would become asso- 
ciated with the C-region sequence, leading to a complete (V + C) 
gene, which the cell could then express. Thus, mechanisms that 
increase diversity in the isolated V-region genes might leave the 
single C-region gene at its distant locus untouched. This model, 
with its proposal of gene rearrangement occurring independently 
in each lymphocyte, was revolutionary in that it violated the then- 
accepted notion that DNA is the same in all cells of the organism. 
Clearly, a definitive assessment of Dreyer and Bennett's proposal 
and an evaluation of the relative significance of somatic and 
germline interpretations of V-region diversity required a direct 
analysis of the genes in question. Two additional mysteries: given 
the fact that each B lymphocyte should contain two copies of each 
gene locus (i.e., from the maternally and paternally derived chro- 
mosomes), why does the cell express only a single L chain and H 
chain, as if the locus on the nonexpressed chromosome were some- 
how silenced — the phenomenon known as "allelic exclusion"? And 
how can one explain the fact that affinity of serum antibodies for 
antigen increases over a period of weeks after antigen exposure — 
the phenomenon of "affinity maturation"? 

Apart from the diversity of V regions in both L and H chains, H 
chains exhibit a different sort of diversity that also demands a mol- 
ecular biologic explanation: all developing B cells synthesize IgM 
initially and can switch H-chain isotype from μ to γ, ε, or α only 
later in their maturation. As the expressed C-region "switches," the 
cell continues to express the same L- and H-chain V regions, so 
that antigen specificity remains unchanged. Thus, in addition to 
understanding how, in different cells, a single C region can become 
associated with multiple different V regions (V-C recombination), 
we need to consider the molecular mechanism by which, during 
lymphocyte development, a single V region may become associ- 
ated sequentially with several C regions (H-chain switch). 

A final level of diversity exhibited by Ig H chains is represented 
by the alternative forms of Ig found embedded in the membrane of 
B cells versus those in blood and secretions. Membrane Igs have 
C-terminal extensions containing hydrophobic amino acids that 
associate with membrane lipids, whereas secreted Igs lack this C- 
terminal piece but are otherwise identical to the membrane coun- 
terparts. Analysis of Ig genes has shown how these two forms are 
encoded in the genome. 

This chapter will begin with a brief discussion of V gene assem- 
bly in H- and L-chain genes. We then describe the H-chain locus 
(including molecular explanations for the membrane forms of Ig 
and isotype switching), followed by descriptions of the κ and λ gene 
loci; however, a detailed discussion of each germline V-gene locus 
is deferred until later in the chapter. Next we consider in detail the 
DNA recombination events underlying V-gene assembly and the 
regulation of this process to maintain allelic exclusion. The chapter 
continues with a discussion of the mechanisms contributing to V- 
region diversity: the germline V repertoire, junctional diversity, and 
somatic mutation. A discussion of the regulation of Ig gene expres- 
sion follows. The chapter ends with several topics in the "applied 
science" of Ig genes. 

The investigations described in this chapter have been chosen 
from the literature to facilitate a clear exposition of the important 
issues rather than to provide a comprehensive compendium of data 
and references on Ig genes. In these descriptions, most of the dis- 
cussion focuses on murine and human Ig genes. Murine genes were 
studied first because of the availability of pristane-induced murine 
myelomas of BALB/c mice, which served as convenient mono- 
clonal sources of Ig protein for early structural studies. The same 
myelomas then provided messenger RNA (mRNA) and DNA for 
molecular biology analysis, which was greatly facilitated by the 
fact that these myelomas derived from the same genetic back- 
ground — the inbred BALB/c strain. Later, study of the homologous 
human loci showed many fundamental similarities between the Ig 
genes of these two species, whereas some other mammalian orders 
show surprisingly significant differences. 

Isotype switching and somatic mutation of Ig genes are covered 
in more detail in separate chapters of this text (Chapters 23 and 24). 



OVERVIEW OF IMMUNOGLOBULIN 
V-GENE ASSEMBLY 

In the late 1970s, experiments on L-chain genes established that 
the Dreyer-Bennett hypothesis was fundamentally correct: each 
lymphocyte expresses only a single Ig molecule encoded by one 
VL and one VH gene, each having been "activated" by a recombi- 
nation event that brings the V gene near its respective C-region 
gene. This conclusion was supported by comparisons of Ig genes 
from B-lymphoid cells, particularly murine myelomas, and the cor- 
responding gene loci from "germline" DNA. (Although true 
germline DNA can experimentally be obtained only from sperm, 
any nonlymphoid DNA is assumed to be representative of germline 
DNA because the rearrangements of Ig genes occur only in lym- 
phoid cells. When DNA from sperm versus other nonlymphoid tis- 
sues has been compared by Southern blots, the results have been 
identical. Therefore, despite the risk of some imprecision, nonlym- 
phoid DNA samples are conventionally referred to as germline 
whether the DNA is from sperm, whole embryo, liver, placenta, or 
other nonlymphoid sources.) 

Evidence from Southern Blots and Gene Cloning 

Initially the myeloma and germline DNA samples were compared 
by Southern blotting using hybridization probes derived from 
myeloma complementary DNA (cDNA). As schematically shown in 
Fig. 1 for an analysis of κ L-chain genes, a Cκ probe detects only a 
single band in germline DNA, consistent with a single Cκ gene. A 
probe representing an expressed Vκ gene detects several bands, as 
though hybridizing to a family of related sequences. Moreover, 
although not shown in Fig. 1, probes representing different 
expressed Vκ genes are found to hybridize to a different set of 
bands, representing a different family of related Vκ genes. These 
observations support the hypothesis of multiple V genes and a 
single C gene. The novel recombination postulate of the Dreyer-Bennett 
hypothesis is supported by the differences observed when these 
probes are hybridized to myeloma DNA instead of germline DNA. 
As shown in Fig. 1, the recombination bringing a V gene close to a 
C gene can cause an alteration in size of the Cκ-hybridizing restric- 
tion fragment. The new rearranged band may be larger, smaller, or 
fortuitously the same size as the germline band, depending on the 
location of the restriction sites flanking the V and C genes. One of 
the V-region bands may similarly be expected to be rearranged in the 
myeloma so as to lie on a different-sized fragment, the same frag- 
ment that hybridizes to the Cκ probe. Results like these for κ and λ 
genes strongly supported the Dreyer-Bennett hypothesis and force- 
fully challenged the concept that every cell in the body has identical 
genes (2,3). In panels of myelomas analyzed for Cκ recombination 
by Southern blotting, many showed evidence of DNA rearrangement 
on both allelic chromosomes. This result argued against the possi- 
bility that allelic exclusion might be explained by a mechanism that 
allowed recombination on only one chromosome, and it raised ques- 
tions about the nature of the "second" gene rearrangement in these 
cells, as discussed later in this chapter. 

FIG. 1. Southern blot demonstration of rearrangement of Ig V and C 
region genes. EcoRI sites in this hypothetical [...] 

A more complete understanding of recombination of Ig genes 
developed from sequence analysis of cloned myeloma versus 
germline DNA. The general structures of the germline V genes are 
similar for the three Ig loci: H chain, κ, and λ. Each V gene begins 
with a sequence encoding a signal peptide of about 22 amino acids. 
(Signal peptides are found at the N-terminus of most proteins des- 
tined for secretion or expression on the cell membranes; after rout- 
ing the protein to the endoplasmic reticulum, the peptide is generally 
removed by specific peptidases.) Within codon -4 (numbering back- 
ward from the beginning of the mature protein sequence), the coding 
sequence is interrupted by an intron, usually 0.1 to 0.3 kb long. What 
was unanticipated was the discovery that each V-region gene as it 
exists in the germline is incomplete, and that recombination is nec- 
essary to assemble a complete V gene (4). For example, most murine 
κ chains have V regions 108 amino acids in length, but murine 
germline Vκ genes encode only about 95 of these. The remaining 13 
amino acids are encoded by segments known as J (joining) regions 
that lie upstream of the C-region gene (5,6). An assembled Vκ gene 
thus results from recombination that joins one of many germline Vκ 
genes to one of five Jκ gene segments (Fig. 2A). A similar recombi- 
nation event is necessary to assemble a complete λ-chain V-region 
sequence from germline Vλ and Jλ genes (7). For H chains, recombination 
assembles a V region from three types of germline elements; 
between the residues encoded by germline VH and JH elements there 
are interposed variable numbers of amino acids (commonly from 
zero to eight residues) encoded by a D (diversity) region. The 
assembly of a complete H-chain V region occurs in two separate 
steps (Fig. 2B): initially one of several germline DH regions joins 
with one of the JH regions; then a germline VH region is added to 
complete the assembled VDJ H-chain gene. 

How Recombination Contributes to Diversity 

The V-assembly recombination contributes in two significant 
ways to the diversity of antigen-binding specificities. First, because 
there are multiple germline V regions and multiple D and J regions, 
the number of possible combinations of VL, JL, VH, DH, and JH is 
the product of the numbers of each of these five 
classes of germline sequence elements. This repertoire is vastly 
larger than could be achieved by devoting the same total length of 
DNA sequence to preassembled V regions. A second factor that 
increases diversity was recognized by comparing nucleotide 
sequences of various myeloma genes to their germline precursors. 
For example, as shown in Fig. 3A, a comparison between the Vκ 
gene expressed in the murine myeloma MOPC41 and the corre- 
sponding germline Vκ and Jκ genes shows that the myeloma gene 
matches the germline precursor through the second nucleotide of 
codon 95; the VJ recombination junction clearly occurs at this 
point because sequence beyond this position in the myeloma gene 
clearly derives from Jκ1. Similar analyses of other myelomas show 
that the recombination junctions can occur at several different posi- 
tions within codon 95 or 96. As shown in Fig. 3B, this flexibility of 
the position of the recombination junction increases the diversity of 
the affected codons. H-chain V regions exhibit this flexibility at 
both VD and DJ junctions. Significantly, the three-dimensional structure 
of Igs established from x-ray crystallography shows that the VL-JL 
junction and the VH-DH-JH junction both form CDR3 loops that 
can contact antigen; thus this junctional diversity is physiologically 
relevant for diversifying antigen binding. The important role of D 
junctional amino acids for antigen binding has been verified by 
mutational analysis (8). In addition, many H-chain VDJ junctions 
(and a smaller percentage of L-chain VJ junctions) reveal inser- 
tions of a few extra nucleotides not present in the germline precur- 
sors; the mechanism of these insertions, known as N regions, 
will be discussed later in this chapter. 
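
The multiplicative argument made above for combinatorial diversity can be illustrated with a short calculation. The following Python sketch is only an illustration: the segment counts are hypothetical round numbers chosen for the example, not measured repertoire sizes for any species, and the names `segment_counts` and `combinatorial_repertoire` are invented here.

```python
from math import prod

# Hypothetical numbers of germline elements in each of the five classes.
# These are placeholders for illustration, not real repertoire sizes.
segment_counts = {"VL": 100, "JL": 4, "VH": 100, "DH": 10, "JH": 4}


def combinatorial_repertoire(counts):
    """Number of V-region pairs if one element of each class is chosen
    independently (junctional diversity is ignored)."""
    return prod(counts.values())


light = segment_counts["VL"] * segment_counts["JL"]                          # VJ light chains
heavy = segment_counts["VH"] * segment_counts["DH"] * segment_counts["JH"]   # VDJ heavy chains
total = combinatorial_repertoire(segment_counts)                             # L x H pairings
elements = sum(segment_counts.values())                                      # germline elements needed

print(light, heavy, total, elements)
```

Under these assumed counts, 218 germline elements yield 1,600,000 possible L-H combinations, which is the sense in which the assembled repertoire exceeds what the same amount of DNA could encode as preassembled V regions.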

When the flexibility of the position of recombination was initially 
discovered, it was hard to understand how the germline elements 
could be joined with such variability and yet maintain the correct 
triplet reading frame between V and J. (An out-of-frame recombi- 
nation would cause the entire C region to be read in a nonsense 
reading frame, so the gene would be nonfunctional.) It soon became 
clear, however, that if one looks beyond the subset of assembled V 
regions that are expressed in myeloma antibodies (a subset selected 
for expression of a functional L and H chain), one can find many 
assembled V genes with out-of-frame recombination junctions (9). 
Indeed, in unselected VJ recombinations the frequency of in-frame 
junctions is about 1/3, as predicted for a recombination mechanism 
insensitive to reading frame. In myelomas with rearrangement on 
both allelic copies of an Ig gene locus, the unexpressed recombina- 
tion is generally out-of-frame or "non-productive." For H-chain VDJ 
recombination, one could theoretically retain the correct reading 
frame between V and J while allowing the interposed D-region seg- 
ments to be used in all three reading frames. In murine H chains, 
however, only a single D-region reading frame is generally found, 
and several mechanisms prevent expression of antibodies with D 
regions in the other two reading frames (10). In human antibodies 
this intense selection against variant reading frames is not found 
(11), allowing for additional sequence diversity. The generation of 
V-region diversity in the three Ig gene loci (IgH, κ, and λ) is con- 
sidered in more detail in a later section. 
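
The expectation of about 1/3 in-frame junctions can be checked with a toy enumeration. The Python sketch below rests on simplifying assumptions that are not taken from the text: the untrimmed V and J ends are taken to join in frame, each end loses between 0 and 5 nucleotides, every combination of losses is equally likely, and no nucleotides are inserted. The function name `in_frame_fraction` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def in_frame_fraction(max_v_trim, max_j_trim):
    """Fraction of V-J junctions that preserve the reading frame when the
    V end loses 0..max_v_trim nucleotides and the J end loses
    0..max_j_trim nucleotides, all combinations being equally likely."""
    outcomes = list(product(range(max_v_trim + 1), range(max_j_trim + 1)))
    # The junction stays in frame only if the total number of nucleotides
    # removed is a multiple of three.
    in_frame = [pair for pair in outcomes if sum(pair) % 3 == 0]
    return Fraction(len(in_frame), len(outcomes))


fraction = in_frame_fraction(5, 5)
print(fraction)
```

In this model 12 of the 36 equally likely outcomes are in frame, so a joining mechanism that ignores reading frame gives a productive junction one time in three.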

Recombination Signal Elements 

Analysis of DNA sequences flanking the germline V-, D-, and J- 
region sequences showed two conserved sequence elements that 
apparently play a role in the recombination event, signaling the 
position where the DNA should rearrange. The first signal element 
is a 7-mer CACTGTG that occurs as a consensus sequence 5' to the 
Jκ coding sequences, with its (reverse) complement CACAGTG 
appearing 3' to the Vκ coding sequences. The second element is a 
9-mer GGTTTTTGT that appears about 23 nucleotides 5' to the Jκ 
7-mer, with its complement ACAAAAACC appearing about 12 
nucleotides 3' to the Vκ 7-mer (5,6). The likelihood that these 
recombination signal sequences (RSS) are significant in the 
recombination is reinforced by their appearance at similar positions 
in L- and H-chain Ig genes throughout phylogeny as well as in T- 
cell receptor (TCR) genes (see Chapter 10), which undergo similar 
V assembly recombinations; furthermore, there are no other well- 
conserved sequences flanking these genes. In all of these systems 
the length of the spacer between the 7-mer and 9-mer appears sig- 
nificant. Recombination apparently occurs only between one cod- 
ing sequence with a 12-bp spacer and another coding sequence 
with a 23-bp spacer, a requirement referred to as the 12/23 rule. 
The benefit of this requirement may be that futile recombinations, 
such as between two Vκ or two Jκ gene segments, are prevented. 
Although a computerized alignment of several hundred spacer 
sequences has detected some preferred nucleotides at specific posi- 
tions (12), mutations of spacer sequences in plasmid recombination 
substrates have little effect on recombination frequency. The lengths 
of the spacers flanking H- and L-chain V, D, and J elements are 
shown in Fig. 4. 
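
The relationship between the two copies of each signal element, and the 12/23 rule itself, can be expressed in a few lines of Python. This is a minimal sketch: the consensus sequences and spacer lengths are those quoted above for the κ locus, whereas the helper names `reverse_complement` and `obeys_12_23_rule` are invented for the illustration and the rule is reduced to a comparison of spacer lengths only.

```python
# Consensus signal elements 5' to the J-kappa coding sequences, as quoted above.
J_HEPTAMER = "CACTGTG"
J_NONAMER = "GGTTTTTGT"

# Spacer lengths (bp) between 7-mer and 9-mer, as described for the kappa locus.
SPACER = {"V_kappa": 12, "J_kappa": 23}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def obeys_12_23_rule(spacer_a, spacer_b):
    """True only when one segment has a 12-bp and the other a 23-bp spacer."""
    return {spacer_a, spacer_b} == {12, 23}


v_heptamer = reverse_complement(J_HEPTAMER)   # element found 3' to V-kappa
v_nonamer = reverse_complement(J_NONAMER)

print(v_heptamer, v_nonamer)
print(obeys_12_23_rule(SPACER["V_kappa"], SPACER["J_kappa"]))   # V to J
print(obeys_12_23_rule(SPACER["V_kappa"], SPACER["V_kappa"]))   # V to V
```

The reverse complements reproduce the sequences found 3' to Vκ (CACAGTG and ACAAAAACC), and the spacer check permits V-to-J joining while excluding the futile V-to-V and J-to-J combinations.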

Although the complementarity of the Vκ and Jκ copies of the 7- 
mer and 9-mer signal elements initially led to the hypothesis that 
these elements might participate in the formation of a stem-and- 
loop intermediate in the recombination reaction, current evidence 
strongly favors an alternative role for the RSS: as recognition 
sequences for DNA-binding proteins mediating the recombination. 
This evidence is presented later in a detailed discussion of V(D)J 
recombination. 

Because of the conservation of the RSS elements among κ, λ, 
IgH genes, and TCR genes, the enzymatic recombinase machinery 
that assembles complete V genes from germline precursors is 
believed to be the same in all these systems. This notion is rein- 
forced by much other evidence, including the observations that 
germline TCR V-gene segments can be correctly rearranged when 
introduced into pre-B cells and that hybrid Ig-TCR rearrangement 
can occur (although only in abnormal cells, as discussed in a later 
section). 

THE THREE IMMUNOGLOBULIN GENE LOCI 

This section presents an overview of the three Ig loci: H chain, 
κ, and λ. The V regions of these loci are described in a later section 
on germline diversity (except that the tiny murine Vλ repertoire is 
discussed in the present section). 

Heavy-Chain Genes 

In the development of a B-lymphocyte, the cell initially produces 
IgM with a binding specificity determined by the productively 
rearranged VH and VL regions. Subsequently, each B cell and its 
progeny cells synthesize antibodies with the same L- and H-chain 
V regions; but they may later switch the isotype of the H chain. 
Early evidence for this developmental scheme includes (among 
other observations) (a) the isotype shift seen during the course of an 
immune response (13); (b) the ability of B-cell clones, namely myelomas 
(14,15) and splenic foci (16), to express IgM plus another isotype, 
with identical VH regions; and (c) in vivo ablation studies suggest- 
ing that IgM-producing cells are the precursors of IgG producers 
(17). The molecular mechanism by which one part of a protein can 
change while another part remains unchanged has generated con- 
siderable interest. 

Several groups (18-20) have demonstrated that active 
rearranged α, γ2b, and γ1 genes isolated from myelomas express- 
ing the respective H chains contain, between their V and C 
regions, DNA sequences derived from the DNA upstream of the 
germline Cμ gene, including one or more JH sequences. These 
observations led to the model (Fig. 5) that the VH region 
rearranges initially to a position 5' to the μ gene (leading to IgM 
production), and that when a cell expresses a new isotype the Cμ- 
region gene is replaced by the CH region encoding the new isotype. 
This isotype switch appears to result from a deletion of the CH 
genes between the assembled VDJ and the CH gene expressed after 
the switch. Early support for this deletion model came from analy- 
sis of the content of specific CH genes in myelomas that had under- 
gone different switch recombinations. Solution hybridization kinet- 
ics or Southern blotting with cDNA-derived CH probes confirmed 
that switching was associated with loss of CH sequences from the 
cell. From the specific C regions lost in myelomas expressing dif- 
ferent isotypes, it was possible to predict a linear order of the dif- 
ferent CH genes on the chromosome (21,22). 

A more detailed picture of the H-chain locus emerged as many 
laboratories reported the isolation of genomic clones for CH genes. 
In general, these clones were obtained in the early 1980s by screen- 
ing genomic DNA libraries with cDNA probes derived from 
myeloma mRNA. From the wealth of data generated, we can con- 
sider only a few interesting conclusions because of space limitations. 

One striking characteristic of CH genes is that the 100 to 110 
amino acid domains (identified by internal homologies of amino 
acid sequences and by three-dimensional structural analysis using x-ray 
crystallography) are encoded as intact exons, separated from 
other domain segments by introns of 0.1 to 0.3 kb (23-25). Thus, 
for example, the mouse γ2b protein has three major domains (CH1, 
CH2, and CH3), with a small hinge domain between CH1 and CH2. 
The gene structure (23,26,27) may be summarized as follows: 

CH1 - intron - hinge - intron - CH2 - intron - CH3 
(292) (314) (64) (106) (328) (119) (322) 

where the numbers in parentheses represent the number of 
nucleotides in each segment. As an interesting contrast, the hinge 
region of the α gene is encoded contiguously with the CH2 domain 
with no intervening intron (25), whereas the unusually long human 
γ3 hinge is encoded by three or four hinge exons (28). Analyses of 
genomic CH genes have led to speculations that the evolutionary 
history of H-chain genes may have included mutations that created 
or destroyed RNA splice sites and thereby converted portions of 
intron sequence into exon and vice versa. For example, the 
sequence of the intron 5' to the hinge of the mouse γ2b gene shows 
a surprising degree of similarity with the sequence of CH1; this 
observation led to the speculation (23) that the hinge exon may 
have originated from a full Ig domain that became foreshortened 
either by the destruction of the RNA splice site at the 5' end of the 
domain or the creation of a new splice site within the domain. 
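
The segment lengths quoted above for the mouse γ2b constant-region gene can be tallied to show how much of this stretch of DNA is exon and how much is intron. The Python sketch below uses only the numbers given in the text; the data layout and the name `total_length` are illustrative choices, not part of any published analysis.

```python
# Segment lengths in nucleotides, in 5' to 3' order, as listed above for
# the mouse gamma-2b gene. The "exon"/"intron" labels follow that listing.
segments = [
    ("CH1", "exon", 292),
    ("intron 1", "intron", 314),
    ("hinge", "exon", 64),
    ("intron 2", "intron", 106),
    ("CH2", "exon", 328),
    ("intron 3", "intron", 119),
    ("CH3", "exon", 322),
]


def total_length(kind):
    """Sum the lengths of all segments of the given kind."""
    return sum(length for _, seg_kind, length in segments if seg_kind == kind)


exon_nt = total_length("exon")
intron_nt = total_length("intron")
print(exon_nt, intron_nt, exon_nt + intron_nt)
```

With these figures the four exons account for 1,006 nucleotides and the three introns for 539, about 1.5 kb in all.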

About 7 kb upstream from the murine Cμ gene lies a cluster of 
four JH segments (six JH segments in humans) that participate in 
VDJ recombination. Further upstream lie 13 D segments (about 2 
in humans) and beyond them the VH regions. V and D regions are 
described later in this chapter in the section on V-region diversity. 

Membrane Versus Secreted Immunoglobulin 

Studies of IgH gene and cDNA structure have provided an expla- 
nation for the alternative membrane and secreted forms of the H 
chain. As noted earlier, the membrane-bound forms of Ig H chain 
are slightly larger than the secreted forms owing to an additional C- 
terminal hydrophobic segment that anchors the protein in mem- 
brane lipids (29). In the case of the μ chain, these two forms are 
products of two different mRNAs of 2.7 and 2.4 kb, which can be 
separated by gel electrophoresis. By comparing the DNA sequences 
of a genomic μ clone and μ cDNA clones corresponding to these 
two RNA species, several laboratories (30-33) have demonstrated 
that the two RNA species represent transcripts of the identical gene 
that have been spliced differently at their 3' or C-terminal ends 
(Fig. 6). The nucleotide sequence encoding the 20 C-terminal 
residues of the secretory (μs) form is derived from DNA contigu- 
ous with the CH4 domain of the μ gene, whereas in the membrane 
mRNA (μm) the sequence after CH4 derives from two exons about 
2 kb 3' further downstream. These membrane exons encode 4 
residues, including a stretch of 26 uncharged residues that span the 
membrane to fix the Ig to the cell surface. The same general gene 
structure has been found for the other CH genes (34-37), suggest- 
ing that the differential splicing mechanism probably accounts for 
the two forms of Ig of all isotypes. 

Early B cells make substantial quantities of both μm and μs, 
whereas maturation to the plasma cell stage is associated with strong 
predominance of μs production, consistent with the function of such 
cells in generating the pool of circulating Ig. The balance between 
the two RNA splice forms of μ has been interpreted as a competi- 
tion between CH4-M1 splicing and the cleavage/polyadenylation at 
the upstream μs poly(A) addition site. The factors influencing this 
balance have been studied by transfecting either early or late B cells 
with μ gene sequences or constructs in which the splice sites or 
cleavage/polyadenylation sites have been mutated, placed different 
distances apart, or rearranged in different order on the transfected 
gene construct. In some experiments constructs have been injected 
into frog oocytes with or without B-cell nuclei. Conflicting inter- 
pretations have emerged as to whether the critical factor influencing 
the μm/μs ratio is differential splicing (38) or poly(A) site choice 
(39,40). It does appear that the length of the intron between CH4 and 
M1 has an influence (41,42), that a stem-loop RNA structure at the 
3' end of the CH4-M1 intron may play a role (43), and that the 
mechanisms regulating this ratio may be different for different iso- 
types (44). Additional investigations will be necessary to explain 
exactly how cell maturation leads to an appropriate alteration in the 
ratio of the membrane and secreted forms of Ig. 

Membrane Ig serves as the antigen-specific component of the B- 
cell receptor (BCR), which is critical for initiating the signal for 
lymphocyte activation on contact with antigen, as described in 
Chapter 7. The segments of membrane Igs (of all isotypes) that 
penetrate into the cytoplasm are too short to encode functional sig- 
nal transduction domains. Instead, transduction is mediated by an 
associated protein dimer composed of the BCR components Igα 
and Igβ. This dimer also has important signaling roles during B- 
cell development before the mature BCR is assembled, as dis- 
cussed later in this chapter. 

Organization of CH Gene Loci 

As genomic clones for the C-region genes of the H-chain loci of 
humans and mice were obtained, efforts were made to "link" them, 
i.e., to clone continuous stretches of DNA including the CH genes 
as well as all the DNA lying between them in the genome. The gen- 
eral strategy of this work was to use cDNA clones to obtain the CH 
genes and to use "gene walking" techniques to fill in the noncod- 
ing DNA between the genes. The murine locus was completely 
linked in 1982 with a report (45) of clones covering the entire 
region of the mouse genome (all eight CH genes), spanning about 
200 kb of DNA on chromosome 12. These clones define the gen- 
eral structure of the region as shown in Fig. 7, where the numbers 
indicate the distance in kilobases between the genes. All the CH 
genes are oriented in the same 5' to 3' direction. Recent sequence 
analysis has shown several γ pseudogenes within the clustered γ 
genes (46). 

The human CH genes also have been cloned and localized to 
chromosome 14q32 (47), but not completely linked as of this writ- 
ing. One significant difference between the human and murine IgH 
loci is that a large duplication exists in the human at the 3' end of 
the H-chain gene locus, with two copies of a γ-γ-ε-α unit (48,49) 
(Fig. 7). One of the duplicated ε sequences is a pseudogene in 
which the CH1 and CH2 domains have been deleted (49,51). In 
addition, the human genome contains a third closely homologous 
ε-related sequence: a "processed" pseudogene found on chromo- 
some 9 (50,51). (Pseudogenes of this type appear to have been 
reverse-transcribed from a processed RNA intermediate and then 
to have been inserted in the genome at locations unrelated to the 
original locus of the transcribed source gene.) A γ-related pseudo- 
gene lacking a switch region is also present in the human IgH locus 
between the two γ-γ-ε-α duplications (52). The map presented in 
Fig. 7 is based on partial contiguous overlaps and pulsed field gel 
electrophoresis (PFGE) (53,54). PFGE allows electrophoretic sep- 
aration of very large fragments (up to several megabases in 
length) that cannot be separated by conventional electrophoresis; 
fragments of this length are useful in mapping over long distances 
and can be generated by restriction enzymes that have unusually 
rare recognition sequences. The map of the human IgH locus in 
Fig. 7 is consistent with the known deletions in the H-chain locus 
(55), as diagrammed in the same figure. 

The IgH locus also has been examined in several other species 
besides mice and humans, and several notable differences have 
been observed. Rabbits, for example, have 13 Cα sequences and 
only a single Cγ gene (56); this unusual expansion of genes con- 
tributing to mucosal immunity may be related to the peculiar habit 
of coprophagy in these animals. In contrast to the multiplicity of 
rabbit Cα genes, pigs have only one Cα gene and eight Cγ genes 
(57). Camels are unusual in having H chains that function in the 
absence of L chains (58). H-chain Ig genes (VH or CH) have been 
cloned from a number of other species, including rats [which are 
highly homologous with mice (59)], cows (56), chickens (60-63), 
horses (64), sharks (65), bony fish (66,67), crocodiles (68), frogs 
(69) and axolotls (70). 

Heavy-Chain Switch 

Switch Regions 

The availability of genomic IgH clones allowed detailed 
sequence analysis of the deletional switch recombination. The 
active switched genes from several myelomas were compared with 
the corresponding germline CH genes and with the germline μ 
gene, with particular attention to the sequences surrounding the 
switch recombination site. In each case the recombination events 
were found to have occurred within or near regions of remarkably 
internally repetitive DNA sequences 5' to the CH coding sequences; 
these have become known as switch (S) sequences (71-74). 



The S region of the mouse μ gene, Sμ, is located about 1 to 2 kb
5' to the Cμ coding sequence and is composed of numerous tandem
repeats of sequences of the form (GAGCT)n(GGGGT), where
n is usually 2 to 5 but can be as high as 17 (74). These repeats
apparently promote deletions within the Sμ region by homologous
recombination events that can occur during the laboratory
construction and isolation of clones containing the Sμ region. Because
of such deletions, most cloned germline μ genes are found on
EcoRI fragments shorter than the 12.5-kb fragment identified in
genomic blots of BALB/c DNA. Deletions of the same region have
been demonstrated to occur in vivo by comparison of the μ locus
in different mouse strains by Southern blotting (75) and may occur
especially frequently during the activity of switch recombination in
normal B cells (76).
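
The repeat structure just described, (GAGCT)n(GGGGT), can be
illustrated with a short scan for such units in a DNA string. This is
only a sketch: the function name and the toy sequence are invented
for illustration, and the toy sequence is not a real Sμ sequence.

```python
import re

# One repeat unit: n >= 1 copies of GAGCT followed by a single GGGGT.
UNIT = re.compile(r"((?:GAGCT)+)GGGGT")

def find_repeat_units(seq: str):
    """Return a list of (start, n) for each (GAGCT)n(GGGGT) unit in seq."""
    seq = seq.upper()
    return [(m.start(), len(m.group(1)) // 5) for m in UNIT.finditer(seq)]

# Toy sequence, invented for illustration (not a real S-mu sequence).
toy = "AAT" + "GAGCT" * 3 + "GGGGT" + "GAGCT" * 2 + "GGGGT" + "CCA"
for start, n in find_repeat_units(toy):
    print(f"unit at position {start}: n = {n}")
```

Tabulating n over a real cloned S region in this way would show the
distribution of repeat lengths (usually 2 to 5 according to the text).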

Similar internally repetitive S regions spanning 1 to 10 kb have
been found 5' to all the other CH genes except Cδ. All of the S
regions include occurrences of pentamers similar to GAGCT and
GGGGT that are the basic repeated elements of the Sμ region (77);
in the other S regions these pentamers are not precisely tandemly
repeated as in Sμ, but instead are embedded in larger repeat units.
The 10-kb Sγ1 region has an additional higher order structure: two
direct repeat sequences flank each of two clusters of 49-bp tandem
repeats (78). S regions of human H-chain genes have been found to be
very similar to their mouse homologs (79-81). Indeed, sequence
similarity between human and mouse clones 5' to the CH genes has 
been found to be confined to the S regions, an observation that sup- 
ports the biologic significance of these regions. 

A switch recombination between, for example, μ and ε genes
produces a composite Sμ-Sε sequence (Fig. 8). By examination of
the germline Sμ and Sε sequences in comparison with the
myeloma- or hybridoma-derived Sμ-Sε composite S region, it has
been possible to localize the exact recombination sites between Sμ
and Sε that occurred in different cells; similar analyses have been
performed with cells producing other isotypes. These studies have
indicated that there is no specific site, either in Sμ or in any other
S region, where the recombination always occurs. Thus, unlike the
enzymatic machinery of V(D)J recombination, the switch machinery
can join sequences in a broad target region; this makes sense
because V(D)J recombination occurs within coding sequences,
whereas switch recombination is less constrained because it occurs
in introns. Many composite switch junction sequences show
evidence of mutations at the recombination breakpoint when
compared with the corresponding germline switch sequences; these
mutations have been interpreted as reflecting an error-prone DNA
synthesis step that may be a component of the switch
recombination mechanism (82).

FIG. 8. Switch regions and composite switch junctions. The
recombination breakpoints in isotype switch recombination fall
within repetitive S regions. Stimuli that activate switch
recombination (IL-4 and CD40 activation in the example shown)
generally promote transcription across the target S region,
initiating just upstream at the I exon. Recombination between Sμ
and Sε produces two composite switch junctions: an Sμ-Sε junction
retained in chromosomal DNA, and a reciprocal Sε-Sμ junction found
in fractions of circular DNA. PCR amplification across either
composite junction can be used to study switch recombination.

DNA excised by switch recombination has been detected by 
cloning from fractions of circular DNA isolated from cells actively 
undergoing isotype switch recombination (83-85). Thus, at least 
some of the excised DNA segments ligate their ends to form switch 
circles; these contain composite switch junctions that are reciprocal 
to the composite switch junction retained on chromosomal DNA 
(Fig. 8). For the example of cells switching from μ to ε, composite
Sμ-Sε junctions are found on chromosomal DNA, whereas Sε-Sμ
junctions can be found representing the reciprocal junctions from
switch circles. Because switch circles are not linked to centromeres
and may not contain origins of replication, they are not efficiently 
replicated. Therefore, they are not found in cells that have divided 
multiple times after switching, e.g., in myelomas or hybridomas. 
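
The dilution argument above can be made concrete with a little
arithmetic. The sketch below assumes an idealized case that the text
does not spell out: each switch event leaves exactly one circle, the
circle is never replicated or degraded, and every cell divides
synchronously. The function name is hypothetical.

```python
def fraction_of_progeny_with_circle(divisions: int) -> float:
    """Fraction of a switched cell's descendants that still carry its
    switch circle after the given number of divisions.

    Assumption (idealized): one non-replicating, non-degraded circle
    per switch event, passed to exactly one daughter at each division.
    """
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    # One circle is shared among 2**divisions descendant cells.
    return 1.0 / (2 ** divisions)

for d in (0, 1, 5, 10):
    print(d, fraction_of_progeny_with_circle(d))
```

Under these assumptions fewer than 1 in 1,000 descendants retain the
circle after 10 divisions, which fits the absence of switch circles
from long-established myelomas and hybridomas.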

Methods of Assaying Switching 

In stable myelomas or hybridomas expressing switched isotypes, 
evidence of switch recombination can be obtained by gene cloning 
or Southern blotting. However, for studies of the regulation and 
mechanism of switch recombination, assays are needed that can 
detect switch recombination in a minority population of cells 
switching in culture. Some laboratories assess switching by simply 
measuring Ig protein of the switched isotype appearing in the cul- 
ture supernatant. Alternatively, reverse-transcriptase polymerase 
chain reaction (RT-PCR) can be used to detect mRNA correspond- 
ing to the mature VDJ-C RNA transcripts of the switched isotype. 
However, because the culture conditions favoring isotype switch- 
ing also may influence transcription or protein synthesis rates inde- 
pendently of switch recombination, RNA or protein assays may not 
faithfully reflect the DNA recombination events. Furthermore, 
switched RNA or protein cannot be assumed to reflect DNA 
recombination if one is exploring one of several models for nonre- 
combinational mechanisms for isotype switching. Therefore, two 
different PCR strategies have been developed to assess switch 
recombination at the DNA level. In one strategy, PCR primers are 
designed to amplify across the composite S region of interest (86). 
A related strategy is to amplify the reciprocal switch junctions 
found on circular DNA (87,88); these junctions can be used to 
"count" recombination events independent of proliferation if one 
assumes that each circle is produced as a by-product of a single 
switch recombination event and, failing to replicate as the cells 
divide, is randomly partitioned to daughter cells at successive
divisions after the recombination event. Because the efficiency of
amplification varies for different composite switch junctions
(smaller templates are amplified more efficiently, and the largest
may not amplify at all), the PCR strategy described above cannot
easily be adapted to assay switch recombination quantitatively. For
this reason a second strategy known as digestion-circularization
PCR (DC-PCR) was developed (89). In this approach DNA from
switching cells is digested with a restriction enzyme, and
restriction fragments (including the ones bearing a composite Sμ-Sε
junction) are ligated to form circles. Primers designed to amplify
across the restriction site generated by ligation of the Sμ-Sε
fragment ends will yield a consistent product whose size depends only
on the distance between primers and the restriction site. From
unswitched DNA no product is amplified because the two primers 
can never both hybridize to the same DNA circle. Therefore, with 
appropriate calibration (90), the amount of DC-PCR product 
formed can be used as a semiquantitative measure of the amount of 
composite switch junctions in a DNA sample. These methods have 
been used in many of the experiments described below. 
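
The reason the DC-PCR product has a constant size can be seen in a
toy model of a self-ligated restriction fragment. This is a sketch
under stated assumptions, not the published protocol: the function
name, the offsets, and the sequences are all invented, and the
primers are represented only by the distance of their 5' ends from
the two fragment ends.

```python
def dc_pcr_amplicon(fragment: str, left_offset: int, right_offset: int) -> str:
    """Sequence amplified across the ligation junction of a fragment
    that has been circularized by joining its right end to its left end.

    left_offset:  distance from the left fragment end to the 5' end of
                  the primer that extends toward the left end.
    right_offset: distance from the right fragment end to the 5' end of
                  the primer that extends toward the right end.
    (Both offsets are hypothetical values chosen for illustration.)
    """
    if left_offset + right_offset > len(fragment):
        raise ValueError("primer positions overlap on this fragment")
    tail = fragment[len(fragment) - right_offset:]  # up to the right end
    head = fragment[:left_offset]                   # from the left end on
    return tail + head                              # spans the junction

# Two toy fragments with identical ends but different internal lengths,
# standing in for composite junctions at different breakpoints.
left_end, right_end = "TTACGGATCC", "GGCATTCAGA"
short = left_end + "N" * 50 + right_end
long_ = left_end + "N" * 500 + right_end
print(dc_pcr_amplicon(short, 6, 7) == dc_pcr_amplicon(long_, 6, 7))
```

The internal length of the fragment, and hence the position of the
switch breakpoint, drops out of the product. As the text notes, in
unswitched DNA the two primer sites lie on different circles, so no
such amplicon exists.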

Regulation of Isotype Switching 

Isotype switching occurs physiologically in animals about 1 week 
after immunization with T-dependent antigens, at about the same 
time that somatic mutation of Ig genes begins. Somatic mutation 
(discussed later in this chapter) clearly occurs in germinal centers of
lymphoid organs (a location that facilitates T- and B-cell
interaction), and there is some evidence that germinal centers are a
major site for isotype switching as well. As demonstrated by in vitro
switching experiments, T cells promote switching by secretion of
cytokines (especially interleukin [IL]-4 and transforming growth
factor-β [TGF-β]) as well as by cell-to-cell contact. A major
component of the cell contact signal is mediated by an interaction
between the B-cell surface marker CD40 and its ligand (designated 
CD40L or glycoprotein [gp]39), expressed on activated T cells. The 
dependence of switching on this interaction is highlighted by the 
genetic disease known as the X-linked hyper-IgM syndrome, which 
was found (independently by several laboratories) to be caused by a 
defect in the gene encoding CD40L/gp39 (91). Patients with this 
syndrome have elevated concentrations of IgM in their serum and 
almost no Igs of other isotypes. In addition, their antibodies fail to 
show affinity maturation or evidence of B-cell memory responses. 
Mouse strains with engineered defects in CD40 or CD40L show a 
similar phenotype, although they respond with normal isotype 
switching to T-independent antigens (92); little is known about this 
T-independent switching pathway. The discovery of the importance
of the CD40-CD40L interaction has facilitated in vitro switching
experiments in which T cells can be replaced with antibodies to 
CD40 or with cells engineered to express surface CD40L. One role 
of the CD40 stimulus is to induce B-cell proliferation. Indeed, other 
proliferative stimuli (e.g., lipopolysaccharide [LPS] or IgM or IgD 
cross-linking) can support cytokine-induced isotype switching in 
vitro in the absence of T cells and CD40 activation; and switching 
may be related to the cell cycle (93). However, CD40 has additional 
effects, including upregulation of IL-4 responsiveness and IL-4 
receptor number (94); the signaling pathways initiated by CD40 are 
under active investigation (95). 

Different isotypes are known to predominate in different immune
responses depending on the antigen, route of antigen administration,
and several other parameters. As discussed more fully in Chapter 23,
these different parameters act in part by influencing the cytokine
milieu of the B cells. IL-4, for example, promotes the expression of
IgE (and IgG1 in mice), whereas TGF-β promotes switching to IgA.
These lymphokines are believed to act by making the C region of the
target isotype accessible to switch recombinase machinery that may
be non-isotype-specific. The accessibility is associated with
expression of an RNA transcript that initiates upstream of a target S
region and extends through the target C region (Fig. 8). This type of
RNA is designated a germline transcript because it is transcribed
while the IgH locus is in germline (i.e., unswitched) configuration;
alternatively, these transcripts are called sterile (i.e., lacking a V
region). After in vitro treatment of B cells with IL-4, for example,
but before any switch recombination to Cε, sterile transcripts are
detected with a structure that includes Cε preceded by a short exon
known as Iε. The Iε sequence derives from DNA upstream of Sε, a
location that would be deleted during the formation of the Sμ-Sε
composite S region; in the germline transcripts the Iε region is
spliced to Cε by removal of an intron containing the Sε region.
Similar transcripts have been found for every isotype examined in
both human and mouse systems, including μ. In each case many of the
same experimental conditions (including cytokines) that favor the
accumulation of sterile transcripts from a particular isotype also
favor switch recombination involving the corresponding S region. In
some cases the signals transduced by the cytokine receptor have been
elucidated. For example, IL-4 stimulates transcription by activating
the transcription factor STAT6, which attaches to one of several
nuclear protein binding motifs in the promoter region upstream of Iε
and Iγ1, as discussed later in this chapter. Apart from I-region
promoters, sterile transcription and isotype switching are also
regulated by an enhancer lying downstream of the murine Cα gene, as
deduced from switching defects in mice in which this enhancer was
replaced by a neomycin resistance gene in all B cells (96); defects
were observed in switching to IgE and several IgG isotypes, but not
to IgG1.

Studies of mouse strains in which the I regions of various
isotypes have been targeted by homologous recombination suggest
that sterile transcription is necessary but not sufficient for recom- 
bination (97-99). The low extent of sequence conservation of the I 
exons and the lack of consistent open reading frames suggest that 
these transcripts do not encode a functional protein. What then is 
their role? One hypothesis is that the critical chromosomal alter- 
ation that renders an isotype locus accessible to the switch recom- 
binase machinery is achieved by the process of transcribing 
through the locus, and that the transcripts themselves serve no 
function. A second hypothesis is that the transcripts participate in 
the recombination event in some way, perhaps by formation of an 
RNA:DNA triple helix (100). In support of this idea, cell-free tran- 
scription of S regions was found to lead to a stable association of 
the transcript RNA with the template DNA (101); significantly, this 
association occurred only with RNA transcribed from S region 
DNA and only when the RNA was transcribed in the physiologic 
orientation. Neither of these two hypotheses concerning the role of
sterile transcription accounts well for a feature conserved in all the
transcripts: the RNA splice that removes the Sx region from the 
mature IxCx transcript. It is noteworthy that sterile transcripts from 
the germline components of V(D)J assembly recombination are 
also synthesized just before that recombination event, and tran- 
scription is also observed from rearranging yeast DNA sequences. 
These observations suggest that the transcription of DNA immedi- 
ately before recombination may be a general feature of recombina- 
tion events common to many biologic systems. On the other hand, 
it is likely that cytokines regulate other aspects of the switching 
mechanism besides sterile transcription because several examples 
have been reported of cytokines up- or downregulating switch 
recombination without a parallel effect on sterile transcripts (102). 

Mechanism of Switch Recombination 

The mechanism of isotype switch recombination has been 
probed with a variety of strategies, so far with limited results. One 
approach to delineating the sequences required for switch 
rearrangement has been to construct plasmid substrates containing
switch sequences that might undergo switch recombination when
transfected into B-lineage cells either stably (103) or in transient
systems (104). For example, the construct of Daniels and Lieber
(105) contained the polyoma origin and T-antigen gene (to allow
replication in murine cells) and fragments of Sμ and Sγ3 segments,
with viral promoters upstream of each and a supF transfer RNA
(tRNA) gene between them; expression of the supF tRNA gene in
appropriately engineered bacteria led to blue colonies on culture
plates. Plasmids undergoing Sμ-Sγ3 recombination in eukaryotic
cell lines, and thereby losing the intervening supF gene, could be
recovered and identified by the production of white colonies in
bacteria. Although various nonlymphoid cell
lines produced white colonies within the first 20 hours of transfec- 
tion (perhaps resulting from DNA repair enzymes acting on nicked 
plasmids), continued increases in the percentage of white colonies 
beyond 20 hours appeared to be B cell specific. Deletion of the 
promoters had only minor effects on the recombination frequency, 
but white colonies were dramatically decreased when the promot- 
ers were arranged so that the S regions were transcribed in the non- 
physiological direction. When the S regions were replaced with 
irrelevant DNA, the direction of transcription had no effect on 
recombination. The dependence of recombination frequency on 
transcriptional orientation of switch sequences parallels findings 
described above in which the RNA-DNA complex involving S 
regions was strand dependent. 

Another strategy for elucidating the switch recombinase mecha- 
nism is to identify intermediates in the reaction, an approach that 
has been strikingly successful in studying VDJ recombination, as
discussed later in this chapter. A single study exploring this
approach has used ligation-mediated PCR (a technique described
later in connection with VDJ recombination) to detect blunt,
double-stranded cuts in the murine Sγ3 region in B cells switching in
culture (106); these cuts may be generated by the switch recombi- 
nase machinery. 

Possible Switch Recombinase Components 

In an effort to identify components of the recombinase machin- 
ery, several laboratories have investigated proteins that bind in a 
sequence-specific manner to S-region sequences in vitro. Several 
examples are described below, although it should be emphasized 
that none of the components discussed in this section has been 
demonstrated to participate in switch recombination. LR1 is a
protein found in nuclear extracts from murine splenic B-lymphocytes
after induction with LPS; it binds to Sγ1, Sγ3, and Sα, as well as to
the H-chain enhancer (107). The protein has been purified (108),
and one component has been identified as the nucleolar protein
nucleolin (109). Sμbp-2 is a ubiquitous protein, also upregulated
by LPS in murine splenic B cells, which binds to a segment from
the tandem repeats in Sμ. A murine cDNA clone was found to
exhibit sequence similarity to genes encoding helicases; such an
activity could be critical for switch recombination (110). NF-Sμ is
another protein that binds to Sμ tandem repeats and is induced in
splenic B cells by LPS; its binding specificity is slightly different
from that of the other proteins already described (111). Two proteins
that bind within Sγ regions to subsequences associated with a high
frequency of recombination junctions have been designated SNIP and
SNAP and apparently correspond (respectively) to the transcription
factors NF-κB/p50 and E47, which are discussed later in this
chapter (112,113). A role for NF-κB in switching is supported by
experiments in B cells from a mouse strain in which the p50 gene has
been disrupted by homologous recombination. In these p50 knockout
mice, isotype switching to IgE and IgG3 secretion was
markedly reduced; however, reduced expression of the corresponding
germline transcripts could indicate that p50 was required
for promoting "accessibility" rather than for the actual
recombination event (114). In these experiments switching to IgG1
expression was almost unaffected by the absence of p50, and α was the
only isotype whose expression was markedly reduced in the face of
normal germline transcription. A possible role for E47 in switching
is supported by experiments in which expression of Id1, an
antagonist of the E2A transcription factors of which E47 is a member,
was found to partially inhibit spontaneous and induced switching
to IgA in the murine cell line CH12.LX2 (115).

Apart from studying proteins that bind to S-region DNA, 
another approach to identify switch recombinase components has 
been to search for an enzyme activity expected to participate in the
recombination. A lymphoid-specific endonuclease activity that
preferentially cleaves G-rich segments of S regions has been
partially purified and proposed as a possible participant in switch
recombination (116).

The possibility that switch recombination depends on some of
the same components that are known to participate in V(D)J
recombination has been tested for several proteins (whose role in
V(D)J recombination is discussed later in this chapter). Both SCID
mice, which are natural mutants of DNA-dependent protein kinase
(DNA-PK), and mice with homozygous knockouts of their
recombination activating gene-2 (RAG-2) genes are impaired in
developing mature B lymphocytes because of their inability to
assemble V genes efficiently. However, when early B-lineage cells
from these mice were allowed to proliferate in vitro and were then
treated with IL-4 and anti-CD40, switch recombination occurred in the
RAG-2 knockout cells but not in the SCID cells (116a). Thus DNA-PK
appears necessary for switch recombination but RAG-2 does not.
DNA-PK binds DNA as part of a complex that also contains the
protein Ku80 (also discussed later in this chapter). Recently, Ku80
was also implicated in switch recombination in experiments in
which Ku80 knockout mice were crossed with mice in which
recombined VκJκ and VDJ genes were "knocked in" to the respective
loci by homologous recombination. Whereas "knock-in" mice
with intact Ku80 genes expressed IgM encoded by the engineered
Vκ and VH genes and also switched to downstream isotypes, the
corresponding Ku80-deficient mice made IgM but did not switch
isotypes, suggesting that Ku80 is also required for switch
recombination (116b).

A recent achievement that holds promise for identifying recom- 
binase proteins is the development of a cell-free nuclear extract 
system that can accomplish recombination between S-region 
sequences in vitro (117,118). This system depends on a powerful
assay in which tritium-labeled plasmid molecules containing Sγ are
incubated with digoxigenin-labeled plasmid containing Sμ.
Recombination between the plasmids is detected as tritium
immunoprecipitable by antidigoxigenin, and the recombinant DNA
structure can be verified by PCR amplification across the composite
switch junctions. Optimal recombination was found to require adenosine
triphosphate (ATP), both Sμ and Sγ, and nuclear extract from
LPS-blasted B cells. Partial fractionation of nuclear extracts
identified an active complex designated SWAP (switch activation
proteins) composed of at least four proteins: nucleophosmin (which has
a RecA-like DNA D-loop forming activity), poly(ADP)ribose polymerase
(PARP, a nuclear protein implicated in DNA repair), nucleolin
(described above as a component of the S-region-binding protein LR1),
and a novel 70 kD protein designated SWAP-70, not homologous to
any known protein family (118). SWAP-70 is strongly expressed
only in B cells that have been activated for switch recombination,
and binds with high affinity to the other components of the complex.
The SWAP complex is a strong candidate for a switch recombinase 
component, but as of this writing, the definitive evidence from a 
knockout experiment is not yet available. 

Nonstandard Switch Recombination 

Thus far we have considered switch recombination to involve a 
simple deletion of the DNA between two S regions; although this 
is the most common scenario, three additional situations should be 
considered for completeness. 

Sequential Switching. Several switch recombination events can
occur sequentially on a given chromosome. One well-studied
example involves sequential switching to γ1 followed by ε in
mouse B-lymphocytes. The same cytokine, IL-4, promotes switching
to both isotypes. After an initial switch recombination generating
a composite Sμ-Sγ1 junction (leading to IgG1 expression), this
composite S region can undergo a secondary switch recombination
with Sε, which lies downstream. In IgE-expressing cells, evidence
of the initial recombination to γ1 can be demonstrated by the
presence of a composite Sμ-Sγ1-Sε junction (119), or by the detection
of the reciprocal switch circle product Sε-Sγ1. To assess the
quantitative importance of this pathway in IgE generation, B
cells stimulated with IL-4 plus LPS were treated with an anti-IgG1
antibody to eliminate cells expressing this isotype from the culture.
IgE secretion was inhibited about 70%, suggesting that most
mouse B cells expressing IgE have undergone an intermediate
stage in which they expressed IgG1 (120). However, in mutant
mice with a block in γ1 switching due to a targeted deletion in the
γ1 locus, the frequency of switching to ε is normal, suggesting that
the sequential switching results from the simultaneous
accessibility of both Sγ1 and Sε rather than an obligatory sequential
switch program (121). Sequential switching to IgE expression via IgG
also occurs in human B cells (81,122), but the quantitative
significance of this pathway is not known.

Inversional Recombination. Some switch recombinations apparently
lead to inversion rather than deletion of the DNA between the
two S regions involved (123,124). A chromosome with an inversional
switch recombination would be incapable of encoding a functional
H chain because the C region downstream of the VDJ region
would be in inverted orientation, but the chromosome could be
"rescued" by a second switch recombination to a downstream constant
region. In human B-cell leukemias, which are under no selection
for Ig production, inversional switch recombination has been reported
to occur at a frequency of about 15% (125).

Trans-Switching. Although most switch recombinations involve
a single chromosome, transchromosomal switching between allelic
chromosomes has been detected in rabbits at a frequency of about
5% (126). The detection of trans-switching in rabbits was
facilitated by the availability of allotypic markers of C and V regions in
this species; the frequency of trans-switching in other species is not 
known. 

Switched Isotypes Without Switch Recombination 

Several laboratories have reported detection of B cells express- 
ing Ig of more than one isotype. Such double-producing cells may 
reflect a normal transient intermediate stage when a switched iso- 
type may be expressed (after normal switch recombination) along 
with IgM that is retained in the cells because of the long half-life 
of the protein or its mRNA. However, some laboratories have 
reported a stable double-producer phenotype in cell lines without 
apparent switch recombination in the expressed IgH locus. For the
case of μ-δ double producers, the explanation is apparently that δ
transcripts can be produced by RNA splicing from a long primary
transcript that includes μ and δ (127). More difficult to explain are
the cell lines expressing μ along with an isotype whose C region is
so distant from Cμ that an analogous long transcript would be on
the order of 100 kb or more; such transcripts are longer than have 
been observed with current laboratory methods, although prece- 
dents for genes whose exons are spread over similar distances are 
known. One interesting proposal being considered to explain 
expression of downstream isotypes without switch recombination 
is that separate short transcripts of VDJ and a downstream CH gene 
(i.e., a sterile transcript) could be joined by a trans-splicing mech- 
anism similar to that documented for trypanosomes and certain 
viruses (128,129). A nonphysiologic mechanism has been 
described that could account for some cases of double isotype pro- 
duction as a consequence of chromosomal duplication (130). Sta- 
ble double-producing cell lines continue to be studied (131-133); 
at present we cannot be certain whether cells stably expressing this 
phenotype represent important physiologic counterparts of normal 
B cell subsets. A semiquantitative assessment of switch
recombination in a population of murine B cells switching in vitro to
IgG1 indicated that DNA rearrangement can account for the IgG1
expression observed (90), suggesting that most expression of
switched isotype Ig is associated with switch recombination, and
that alternative nonrecombinational models for switching do not
seem to be required on quantitative grounds.

Kappa Light-Chain Genes 

In comparison with the H-chain genes, the κ locus is relatively
simple. A single Cκ gene with a single exon and no reported
alternative splice products is found in both mice and humans.
Upstream of the murine Cκ gene lie five Jκ gene segments, spaced
about 0.3 kb apart (5,6). Of these Jκ segments, the third encodes
an amino acid sequence never observed in κ chains and is believed
to be nonfunctional owing to a defect in the splice donor site that
would join the corresponding RNA sequence to Cκ. The human
locus is similar, with five Jκ regions upstream of Cκ; however, no
homolog of the defective murine Jκ3 is present in the human Jκ
cluster, whereas an additional Jκ sequence lies downstream of the
sequence homologous to murine Jκ5 (134,135). Upstream of the
Jκ segments in both species lie the Vκ genes, which will be
described later in this chapter.

Apart from Vκ-Jκ rearrangement, an additional recombination
event occurs in this locus, a recombination unique to κ genes and
apparently mediated by the same 7-mer/9-mer signal elements
involved in V(D)J recombination. This event, which involves
deletion of the Cκ gene segment, was initially suggested by the
observation that Southern blots of DNA from λ-expressing human
lymphoid cells generally show no detectable Cκ sequence (136).
Apparently in most B cells the Cκ genes are deleted from both
chromosomes before λ gene rearrangement begins. When the
boundaries of the deleted segment of DNA were examined in
several human and mouse cell lines, a common sequence element was
found at the downstream boundary; this element was designated
RS (recombining sequence) in the mouse studies (137) and Kde
(kappa-deleting element) in the human studies (138). The human
Kde in germline DNA is located 24 kb downstream from the Cκ
gene and is flanked by a 7-mer/9-mer RSS similar to that found
flanking the Jκ regions (i.e., with a 23-bp spacer) (139). The
similar murine RS is about 25 kb downstream from murine Cκ (140).
The Kde element can apparently recombine either with a Vκ gene
segment (leading to a deletion of the entire Jκ-Cκ locus) or with an
isolated 7-mer element that is located in the Jκ-Cκ intron (leading
to deletion of Cκ but retention of the Jκ locus). The 7-mer in the
Jκ-Cκ intron is 30 bp 5' from a poorly conserved 9-mer-like
sequence, a spacing that seems to violate the usual 12/23 rule. The
significance of this unusual spacer is not understood, but possibly
the 7-mer in these recombinations is active without a functional
9-mer, as seems to be the case for secondary VH recombinations
(discussed in a later section).
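
The spacer argument in this paragraph can be written out as a simple
check of the 12/23 rule. The sketch is illustrative only: the
function name is hypothetical, and the 12-bp spacer assigned to Vκ
segments is standard background knowledge rather than a value stated
in this section.

```python
def satisfies_12_23_rule(spacer_a: int, spacer_b: int) -> bool:
    """True if one signal has a 12-bp spacer and the other a 23-bp spacer."""
    return {spacer_a, spacer_b} == {12, 23}

KDE_SPACER = 23      # from the text: Kde RSS resembles that of J-kappa
V_KAPPA_SPACER = 12  # assumption: standard V-kappa RSS spacer length
INTRON_SPACING = 30  # from the text: 7-mer to 9-mer-like distance

print(satisfies_12_23_rule(V_KAPPA_SPACER, KDE_SPACER))  # V-kappa with Kde
print(satisfies_12_23_rule(INTRON_SPACING, KDE_SPACER))  # intron 7-mer with Kde
```

The first pairing conforms to the rule and the second does not, which
is why the text suggests that the intronic 7-mer may act without a
functional 9-mer.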

A comparison between the mouse RS and human Kde sequences 
(138,139) shows that the recombination signals are highly con- 
served and that downstream of these elements a region of about 
500 bp is partially conserved (about 50% sequence identity). The 
latter region includes open reading frames of 127 (mouse) or 102 
(human) codons. It is not known whether these reading frames are 
ever expressed as protein as a consequence of the RS/Kde recom- 
bination events, but the fact that the recombination may occur with 
either a Vκ region or intron sequence suggests that the sequences
joined by the event may be less important than the sequences
deleted. RS/Kde elements are consistently found to be rearranged
in cells in which Cκ is deleted and λ rearrangements are found; this
has led to the speculation that the RS/Kde recombination event may
mediate the developmental switch from κ to λ gene rearrangement,
perhaps by deleting a gene for a negative regulator of λ gene
rearrangement. However, current evidence argues against this view.

Lambda Light-Chain Genes 
Murine λ Locus

In laboratory mouse strains, λ chains represent only about 5% of
L chains, and this diminished abundance is associated with
remarkably meager diversity. In contrast to the κ system with its
multiple V-region families, amino acid sequence analysis of
monoclonal λ chains detected only two sequences that appeared to
represent germline Vλ regions. Furthermore, in contrast to the single
mouse Cκ region, three nonallelic mouse isotypes are known from
secreted λ chains; these are designated λ1, λ2, and λ3, in
decreasing order of abundance.

The first λ gene to be cloned was a germline Vλ2 gene obtained
by Tonegawa's laboratory in 1977 (141) (this was the first Ig gene
cloned). The sequence of this Vλ gene (7) showed structural
features that are similar to those of other germline Vλ genes, as well
as Vκ and VH genes, which were discovered later. The Vλ2 coding
sequence begins with a 19-amino-acid signal peptide that is
interrupted within codon 4 by an intron (which was one of the first
introns demonstrated). After the remaining signal peptide codons,
the DNA sequence matches closely that expected based on amino
acid sequence determined chemically from a λ1 myeloma L chain.
However, the sequence of this germline Vλ2 gene ends abruptly 13
codons short of the expected end of the Vλ2 region, an observation
that led to the first recognition of a separately encoded J region.

Cloning and long-range mapping studies by pulsed field gel
electrophoresis (142,143) have led to a substantial understanding
of the mouse λ locus (Fig. 9). There are four Cλ genes, each with
its own Jλ-region gene located about 1.3 kb 5' from the Cλ. The
J-Cλ3 and J-Cλ1 genes are arranged in one cluster about 3 kb apart
with the Vλ1 gene lying about 19 kb upstream. A second Cλ cluster
lying about 130 kb upstream from the Cλ3-1 locus contains J-Cλ2 and
an unexpressed gene J-Cλ4. These are flanked by two upstream Vλ
genes, Vλ2 and the rarely used Vx, which has an in-frame
termination codon at its 3' end (144). The gene order
(V2-Vx-JC2-JC4-V1-JC3-JC1) explains the common expression of Vλ2
(or Vx) in association with Cλ2 and Vλ1 with Cλ1 or Cλ3. The Vλ2 has
been found in rare association with the 190-kb distant Cλ1 locus, but
the backward recombination of Vλ1 with Cλ2 has not been observed.
The similarities between the four J-C genes suggest that the two
clusters arose by a duplication of an ancestral VJ-Cλx-J-Cλy unit
that in turn was the result of a prior J-Cλ duplication event. The
ancestry of the Vx gene is uncertain because this gene is rather
dissimilar to the other Vλ genes; indeed, it resembles Vκ as much as
Vλ. Anti-Vλx antisera detect expression of this Vλ in all laboratory
mice tested, but it may have a particular restricted function.

The sequences of genes in the λ locus have been examined for
clues that might explain the relative abundance of their expressed
products, λ1 > λ2 > λ3 >>> (λ4) (145). The sequence of Cλ4
includes several amino acid substitutions but no termination codons
that would necessarily render it nonfunctional; however, at the 3'
end of Jλ4, a mutation has destroyed the "GT" found at almost all
known donor splice sites, so that an RNA transcript of this gene
would not be properly processed (reminiscent of the mouse Jκ3). The
Jλ gene segments are all flanked on their 5' sides by sequences
similar to the 9-mer and 7-mer signal elements observed in the VH
and Vκ system. The 12/23 rule discussed in relation to spacing
between the signal elements in the κ locus also applies to λ, but in
the λ locus the RSS elements are spaced about 23 bp apart for V
regions and about 12 bp apart for J regions (the opposite of the
arrangement in κ genes), as shown in Fig. 4. The decreased abundance
of λ2 and λ3 relative to λ1 may be related to discrepancies between
their 9-mer homology elements and the consensus 9-mer element.
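The spacer arrangement just described can be written down as a small check. The following is only an illustrative sketch: the table `SPACER_BP` and the function names are hypothetical, the spacer lengths are the nominal 12 and 23 bp quoted in the text (real spacers are only "about" that long), and the rule is reduced to its simplest form, namely that joining requires one 12-bp and one 23-bp signal.

```python
# Nominal spacer lengths (bp) between the 7-mer and 9-mer, as given in the text.
# The dictionary layout and all names below are illustrative assumptions.
SPACER_BP = {
    ("kappa", "V"): 12,
    ("kappa", "J"): 23,
    ("lambda", "V"): 23,  # the opposite of the kappa arrangement
    ("lambda", "J"): 12,
}


def obeys_12_23_rule(spacer_a: int, spacer_b: int) -> bool:
    """True when one signal has a 12-bp spacer and the other a 23-bp spacer."""
    return {spacer_a, spacer_b} == {12, 23}


def can_join(locus: str, segment_a: str, segment_b: str) -> bool:
    """Check whether two segment types of one locus are a permitted pair."""
    return obeys_12_23_rule(
        SPACER_BP[(locus, segment_a)], SPACER_BP[(locus, segment_b)]
    )


if __name__ == "__main__":
    for locus in ("kappa", "lambda"):
        print(locus, "V-J allowed:", can_join(locus, "V", "J"))
        print(locus, "V-V allowed:", can_join(locus, "V", "V"))
```

In both loci the V-J pairing satisfies the rule even though the spacer assignments are reversed, while a V-V pairing does not.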

Analyses of λ genes in wild mice by Southern blotting have
indicated more complex and varied loci than that seen in typical
laboratory strains (146). These complex λ loci may result from gene
duplication events beyond those evident in laboratory strains,
although the observation that at least one wild Vλ gene missing
from BALB/c is similar to a human Vλ (146) suggests that some of
the difference between wild and laboratory strains must be due to
gene loss in the latter.



Human λ Locus

Lambda L chains are much more abundant in humans than in
mice (about 40% of human L chains versus about 5% in mice).
Furthermore, four isotypic forms of human λ chains have been
characterized, known by their original serologic designation as
Kern-Oz-, Kern-Oz+, Kern+Oz-, and Mcg; several other variants
have been described, perhaps representing allelic polymorphisms.

Seven human Jλ-Cλ segments are clustered within an approximately
33-kb region of DNA that has been entirely sequenced
(147-149). As shown in Fig. 9, genes for the four major
expressed human λ isotypes have been localized within the major
cluster and correspond to JCλ1, JCλ2, JCλ3, and JCλ7. The
remaining three homologous J-C segments are apparently
pseudogenes, with either in-frame stop codons or frame-shifting
deletions. However, JCλ6 may be functional in some individuals
(150), and the common allele, which has a 4-bp insertion leading
to a deletion of the C-terminal third of the Cλ region, can
nevertheless undergo Vλ-Jλ recombination, encoding a truncated
protein that can associate with H chains (151). A variety of
polymorphic variants of the human λ locus have been detected,
apparently the result of gene duplication; as shown in Fig. 9, one to
three extra λ segments have been detected on Southern blots of
human DNA (152).

Three Cλ-related sequences have been discovered near the major
Jλ-Cλ cluster. One of these, designated λ14.1, represents the
human homolog of the murine surrogate L chain λ5. Finally, an
additional weakly hybridizing DNA segment outside the linked
cluster has been characterized as a processed pseudogene (153). V
genes of the human λ system have been completely characterized,
as discussed in a later section.

λ-Related Surrogate Light Chains

Immunoglobulin μ H chains can be detected on the surface of
pre-B cells that do not make L chains. However, in mature B cells,
Ig H chains cannot reach the cell surface if L-chain synthesis is
interrupted. What allows the expression of surface H chain in
pre-B cells in the absence of L chain?

FIG. 9. Germline λ genes. The maps in this figure are schematic, i.e., not to scale. A: The murine λ gene system
includes four JC complexes and three V genes, which have been characterized in two unlinked contigs (sets of
overlapping clones) as shown. B: The human λ locus has been characterized by complete sequence analysis. The
human VpreB surrogate L-chain gene is located within the Vλ cluster. The Cλ locus includes a segment of seven JC
complexes plus three additional unlinked sequences. The hatched JC complexes diagrammed above the seven
linked λ sequences represent polymorphic variants with additional duplications of the JC unit as deduced from
Southern blots. The 14.1 sequence (the human λ5 surrogate L-chain homolog) lies downstream from the JC
cluster, but its location relative to the other λ-like sequences is not known. Exon 1 of the 14.1 gene is homologous
to an exon upstream from Jλ1 (gray rectangle).

The first clues to this question were uncovered in a search for
genes whose expression is specific to the pre-B stage of lymphocyte
development. Melchers and colleagues identified one such
gene that demonstrated striking sequence similarity to the J and C
regions of the λ locus; they named it λ5 because four murine Cλ
genes were already known (154,155). The genomic gene
includes three exons (Fig. 10): exon 1, which appears to encode a
signal peptide; exon 2, whose 3' end is homologous to Jλ; and
exon 3, homologous to Cλ. When flanking regions of the genomic
λ5 clone were tested as probes against pre-B cell mRNA, another
transcribed segment was found about 4.7 kb 5' from λ5 (156)
(Fig. 10). Sequence analysis of the latter region showed similarities
to both Vλ and Vκ; for this reason (and because of its expression in
pre-B cells) it is called VpreB1. A second, nearly identical sequence
in the mouse genome is named VpreB2 (157) and appears to be
functional (158); a less similar VpreB3 also has been described (159).
Neither λ5 nor VpreB genes show evidence of gene rearrangement in B
or pre-B cells. Both genes have typical consensus splice sites and
initiation and termination codons and have no apparent defects that
would prevent their expression as proteins. That they are expressed
and serve an important role is suggested by the conservation of
homologs in every mammalian species examined.

Evidence strongly supports the notion that these genes encode
surrogate L chains (SLCs) that associate with μ H chains to permit
surface μ expression before the availability of L chains. Thus, when
a μ H-chain gene was transfected into an Ig-negative myeloma line,
no surface μ expression was observed unless λ5 and VpreB genes
were also transfected (160). The surface μ chains were found to be
covalently linked to the 22-kDa product of the λ5 gene, whereas
the 16-kDa VpreB product was noncovalently associated. A similar
complex is observed in pre-B cell lines and in normal bone marrow
pre-B cells (161). The V-like VpreB gene product [also known as ι
(162)] apparently associates with the Cλ-like λ5 product (also
known as ω) to form an L chain-like heterodimer that can fulfill
some functions of a true L chain.

One likely role for a μ-SLC complex is suggested by the
observation that most Vκ-Jκ recombination occurs only in cells
expressing a functional μ H chain (as discussed more fully in the
section on regulation of V(D)J recombination); apparently μ-SLC
expression on the cell surface can trigger the onset of Vκ-Jκ
rearrangement. Evidence for this view comes from experiments in which
a pre-B line that normally does not rearrange its κ locus was
transfected with a construct encoding the membrane form of μ H chain
(163); when the transfected μ gene was expressed in a complex
containing VpreB and λ5, Vκ rearrangement was induced. In
contrast, surface expression of a deleted μ gene (μΔm), which lacked
VH and CH1 and which did not associate with SLC, was ineffective
in inducing Vκ rearrangement unless the μΔm protein was
crosslinked by an anti-μ antibody. These results suggest that the
SLC may facilitate cross-linking of surface μ chains in pre-B cells,
a necessary step before the B cell can proceed to Vκ rearrangement,
L-chain synthesis, and mature Ig production. Further support
for such a critical role is discussed later in this chapter.

Human homologs of both λ5 and VpreB have been cloned.
Three λ5-like sequences are located downstream from the Cλ
cluster on chromosome 22 (164), but only one (designated 14.1)
appears to be functional, possessing the three-exon structure of λ5
(165-167). Interestingly, a sequence upstream of Jλ1 is homologous
to exon 1 of 14.1/λ5, suggesting that 14.1 and Jλ-Cλ1 may
have had a common ancestral gene that could be expressed in
either of two ways: (a) by rearranging its J-like exon 2 with a
V-region gene, like modern λ genes; or (b) without rearrangement,
using exon 1, with the encoded protein assembling with a
noncovalently linked VpreB-like subunit. The human VpreB homolog
lies within the Vλ cluster (168), in contrast to murine VpreB,
which lies close upstream of λ5.



V GENE ASSEMBLY RECOMBINATION 

The mechanism by which germline variable-region constituents
(VL and JL, or VH, D, and JH) assemble in the DNA to form a
complete active V gene has been pursued ever since Ig gene
recombination was first discovered. In this section we address (a) the
topology of the recombinations from a "macro" viewpoint, (b) the
components of the recombinase machinery (a "micro" view), and
(c) the regulation of that machinery in B-cell development.

FIG. 10. λ-related genes that encode a surrogate L chain. The top
line of the diagram portrays the exons of the VpreB1 gene and the
λ5 gene of the mouse, which have been physically linked as shown.
The second line shows sequence similarity relationships with other
known Ig or TCR sequences. The expressed mRNAs and proteins
that have been detected from these two genes are shown below.



Topology of V Assembly Recombination 

Deletion Versus Inversion 

The earliest model for Vκ-Jκ rearrangement assumed that V
segments and J segments were all oriented in the same direction of
transcription and that the DNA between the recombining V and J
segments was simply excised and lost from the cell (Fig. 11A).
However, Southern blotting of a panel of myelomas and normal
κ-bearing lymphocytes showed that some cells had retained the DNA
just upstream from Jκ1, a region that should have been absent from
all chromosomes that underwent deletional recombination (169).
Although several complex models were proposed to explain such
results, the presently accepted explanation is simple: some Vκ
genes are oriented in the opposite direction from the Jκ-Cκ region.
This topology would allow the VJ recombination to occur by an
inversion of the DNA between the recombining V and J segments
(Fig. 11B), leaving the DNA upstream from Jκ1 retained on the
chromosome. The same recombinase machinery can presumably
rearrange the germline elements by either inversion or deletion,
depending on the relative orientations of the sequences, because
this enzymatic machinery "sees" only the DNA in the immediate
vicinity of the recombination site (circled in Fig. 11) and is
insensitive to the topology of the DNA strands far from this site. One
implication of this model (Fig. 11B) is that cells that have
undergone an inversional Vκ-Jκ recombination should retain on the
chromosome a recombination joint with two sets of signal
sequences (the RSS from downstream of the Vκ and the RSS from
upstream of the Jκ segment) joined together. Indeed such signal
joints (also known as flank products and reciprocal joints) have
been detected in several cell lines (170-172). In contrast to the
flexibility observed in the position of the recombination breakpoint
in the VJ segment (the coding joint), the sequences of signal joints
usually show the J-derived 7-mer joined directly to the V-derived
7-mer, without even a single intervening nucleotide between them.
Surprisingly, the signal joints retained on the expressed
chromosome have almost always been derived from Jκ1. As additional
evidence that inversion can occur in Vκ-Jκ recombinations, several
laboratories (173,174) have reported that engineered gene
constructs carrying Vκ and Jκ recombination signals in opposite
orientation can undergo recombination by inversion when transfected
into a B-lymphoid cell line.

FIG. 11. The same micro mechanism of recombination
can join Vκ and Jκ by deletion or inversion,
depending on the relative orientation of the two
precursors in germline DNA. A: When the V coding
sequence (shaded rectangle) and the J coding
sequence (white rectangle) are oriented in the same
5'→3' direction in germline DNA (internal arrowheads),
the recombination yields a VJ coding joint
plus a DNA circle containing the signal joint
(apposed triangles). B: If V is oriented in the opposite
direction in germline DNA, then an identical
recombination reaction at the micro level (inside shaded
circle) leaves the signal joint linked to the recombined
VJ coding joint.
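The two outcomes of Fig. 11 can be mimicked with a toy model in which the chromosome is an ordered list of oriented elements. This is only a sketch under simplifying assumptions: the element names (`V`, `RSS_V`, `intervening`, `RSS_J`, `J`), the `recombine` function, and the strand notation are all hypothetical, joints are treated as precise, and a single V and a single J are assumed.

```python
def flip(strand: str) -> str:
    """Reverse the orientation of one element."""
    return "-" if strand == "+" else "+"


def recombine(chromosome):
    """Join V to J in a toy chromosome given as a list of (name, strand).

    Returns (retained_chromosome, excised_circle). If V has the same
    orientation as J ("+"), the DNA between the coding ends is excised as
    a circle carrying the signal joint. If V is inverted ("-"), the DNA
    from the V coding end up to the J coding end is inverted in place, so
    the signal joint stays on the chromosome.
    """
    names = [name for name, _ in chromosome]
    v, j = names.index("V"), names.index("J")
    if chromosome[v][1] == "+":
        retained = chromosome[: v + 1] + chromosome[j:]
        circle = chromosome[v + 1 : j]
        return retained, circle
    inverted = [(name, flip(s)) for name, s in reversed(chromosome[v:j])]
    retained = chromosome[:v] + inverted + chromosome[j:]
    return retained, []


# Germline arrangements (hypothetical, one V and one J only).
deletional = [("V", "+"), ("RSS_V", "+"), ("intervening", "+"),
              ("RSS_J", "+"), ("J", "+")]
inversional = [("RSS_V", "-"), ("V", "-"), ("intervening", "+"),
               ("RSS_J", "+"), ("J", "+")]

if __name__ == "__main__":
    for label, germline in (("deletion", deletional), ("inversion", inversional)):
        retained, circle = recombine(germline)
        print(label, "retained:", [n for n, _ in retained],
              "circle:", [n for n, _ in circle])
```

In the inversional case the retained chromosome carries `RSS_V` directly next to `RSS_J` (the signal joint) together with the intervening DNA, which corresponds to the retained DNA upstream of Jκ1 described in the text.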

The idea that some germline Vκ genes are oriented opposite to
the Jκ-Cκ locus has been directly verified for the most J-proximal
human Vκ gene segments and for one of the two large duplications
in the human Vκ locus, as discussed in a later section. For the
mouse Vκ locus, less is known about V orientation; but the
observation of signal joints that are retained on the chromosome in
murine cells can most easily be explained by inverted murine Vκ
genes (175). On the other hand, in the H chain or lambda loci there
is no evidence for inverted V genes or retained signal joints, so
these loci probably recombine only by deletion.

When the recombination occurs by deletion, the model of
Fig. 11A suggests that a signal joint is formed on a circular DNA
molecule; such a DNA circle would not be attached to the main
chromosome and, lacking an origin of replication and a centromere,
would be expected to be lost from the progeny of the cell in which
the recombination took place. Nevertheless, by isolating circular
DNA from cells that are undergoing Vκ-Jκ rearrangement, it has
been possible to detect the predicted molecules bearing signal
joints (176), supporting the model.



Secondary Recombinations 

A final issue for consideration of V assembly topology at the
macro-level concerns secondary V gene recombinations. As
discussed in an earlier section, the flexibility of VJ or VDJ joining
causes nonproductive out-of-frame recombination with high
frequency. A B-lymphocyte that rearranged its κ genes
nonproductively on both parental chromosomes might be thought to
have no further avenue for making a functional L chain; but the
availability of upstream V genes and downstream J segments could
allow additional recombinations to occur, as shown in Fig. 12A. More
complex events are possible as a consequence of the inverted
orientation of some Vκ genes. The occurrence of such secondary
recombinations has in fact been reported for κ genes (172) and
would be implied by the recovery of chromosomal signal joints that
are not reciprocal to coding joints in the same cells. The
preponderance of Jκ1-derived nonreciprocal flank products observed in
myelomas may result from initial nonproductive recombinations
between this J segment and inverted V genes, followed by
successive recombinations involving more downstream J segments; by
the time a productive rearrangement occurs, many myelomas will
carry signal joint relics of earlier recombinations involving Jκ1
(177). In addition to lymphocytes with nonproductive VκJκ
junctions on both chromosomes, cells that have assembled a productive
VκJκ joint may undergo secondary recombination if the resulting
VH-VL pair recognizes an autoantigen; this type of secondary
recombination, known as receptor editing, is considered in more
detail later in this chapter.

For H-chain genes the possibility of secondary recombination
might seem to be ruled out by the fact that a VDJ rearrangement
must eliminate all the 12-bp spaced signal elements from the VH
locus, because these elements are deleted from both sides of the D
region that is retained in the recombined VDJ unit and are lost
along with all the germline D segments eliminated by the VD and DJ
recombination events (Fig. 12B). Secondary DJ rearrangements should
be possible before VHD recombination removes unrearranged upstream
DH segments (Fig. 12B), and indeed this has been shown to occur (178).
Of greater functional interest has been the demonstration (179) that
upstream germline VH genes can recombine with an established
VDJ unit, displacing the originally assembled V gene. This type of
recombination is apparently mediated by a sequence that closely
matches the consensus signal 7-mer and that appears near the 3'
end of the coding region in about 70% of VH genes (Fig. 12B)
(179,180). The internal 7-mer is not found in most L-chain genes.
After VH replacement, the few nucleotides remaining from the
originally assembled VH could potentially contribute to diversity; such
nucleotides would be difficult to distinguish from N-region
nucleotides. Secondary recombination thus represents an escape
mechanism for cells with nonproductive rearrangements on both
H-chain chromosomes, or, as alluded to above, for cells whose
antibody recognizes an autoantigen (181); it is not known, however,
how frequently such escapes occur in these circumstances, as
opposed to the alternative path of cell death. The fact that the
isolated 7-mer is apparently able to function in VH replacement
recombinations without an associated 9-mer again suggests that the
7-mer is the more critical recombination signal, although it has been
suggested that an additional consensus sequence upstream of the
internal 7-mer (181) may contribute to VH replacement recombination.

FIG. 12. Secondary recombinations. A: In the
κ L-chain system, a primary recombination
can be followed by recombination between an
upstream V and a downstream J. B: Analogous
secondary recombinations can occur in
the H-chain system between upstream D and
downstream J segments. After VDJ recombination
eliminates all short spacer signal elements
from the chromosome, secondary
recombination can still occur between VH (long
spacer signal) and an internal heptamer within
the VH coding sequence of the VDJ unit.

A Micro View of the Mechanism of V Assembly Recombinase Machinery

As mentioned earlier, the same recombinase machinery is
believed to mediate V gene assembly recombinations of all four
types in the Ig gene systems (κ, λ, VH-D, and D-JH) as well as
similar events in the four TCR gene loci. This belief is based on the
observation that all these systems share the same 7-mer/9-mer RSS
and follow the same 12/23 spacer rule of recombination.
Furthermore, gene constructs designed to test in vitro recombination of
TCR gene segments were found to be accurately recombined when
transfected into B cells. The severe combined immunodeficient
(SCID) mouse strain was found to have a deficiency in
recombination of both Ig and TCR genes, suggesting that both systems
could be affected by a single gene defect (182). Finally, the two
recombination activating genes, RAG-1 and RAG-2, have been found to
be key mediators of both Ig and TCR gene recombination. The
assumption that the same recombinase machinery operates on
these two gene families has allowed investigators to pool
knowledge concerning the mechanism of the recombination gained from
studies of both systems. On the other hand, the assumption of a
common recombinase raises the question of how B cells
preferentially rearrange Ig genes (and T cells TCR genes) when both gene
systems are available to be rearranged by the common recombinase
in both cell lineages; this issue will be addressed later in this
section. The mechanism of V gene assembly has been investigated by
several different strategies: sequence analysis of normal substrates
and products (germline and recombined DNA), the use of plasmid
substrate constructs capable of recombination after transfection
into lymphoid cells (in order to assess the effects of alterations in
the substrate sequences), the study of presumed intermediates in
the enzymatic reaction, purification of proteins that bind to RSS
motifs or that perform enzymatic functions hypothesized to occur
during the recombination, studies of mutations that affect the
efficiency or fidelity of VDJ recombination, and, ultimately, cell-free
in vitro studies using cell extracts or putative components of the
recombination machinery.

Recombination Model

A model for the detailed mechanism of the recombination event 
must account for the observed features of the recombination prod- 
ucts — the coding and signal joints — and of their germline precur- 
sors. The features in the germline precursors that appear necessary 
and sufficient for recombination are the 7-mer and 9-mer RSS with 
appropriate spacing (12 and 23 bp); model substrates containing 
these elements are competent to undergo recombination even in the 
absence of normal V, D, or J coding regions, although the effi- 
ciency of recombination can be influenced by features of the 
sequences replacing the coding regions. As for the products, the 
features of the signal joints are relatively simple: the 7-mers are 
joined "back-to-back" with only rare additions or deletions. The 
features of the coding joints are more complex, due to the flexibil- 
ity of junctions as discussed earlier: 

1. A variable number of bases are deleted from the ends of the
coding regions (in comparison with the "complete" sequence 
in the germline precursor). 

2. Nongermline nucleotides (N regions) unrelated to the germline 
precursor sequences are added in some coding joints; these are 
generally rich in G and C nucleotides. 

3. Less frequently, extra bases are added that can be interpreted 
as P nucleotides; these are nucleotides that are joined to the 
end of an undeleted coding sequence and that form a palin- 
drome (P) with that sequence end (183,184). P nucleotides are 
generally only 1 or 2 bp, but they can be longer, especially in 
mice with the SCID defect (185). 

The recombination model first proposed by Alt and Baltimore
(186) accommodates many of these observations and, with some
recent modifications, can serve as a framework for consideration of
the recombination mechanism (Fig. 13). The recombination is
thought to begin with binding of components of the enzymatic
recombinase machinery to the 7-mer/9-mer RSS adjacent to the two
segments to be recombined. Both DNA segments are then cleaved at
the border of the two 7-mers (a reaction now known to be catalyzed
by the products of the RAG genes). The two 7-mer ends are joined
together without modification, but the ends that will form the coding
joint (which are now known to exist transiently in the form of a
"hairpin" loop) are digested to varying extents by an exonuclease
activity. Variable numbers of nucleotides may be added to the 3'
ends through the action of terminal deoxynucleotidyl transferase
(TdT). Then the 5' ends are filled in by a DNA polymerase and the
resulting flush ends are ligated together, completing the
recombination event.
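The junctional features listed above (deletion, N regions, and P nucleotides) can be illustrated with a toy sequence model. The sketch below is an assumption-laden simplification and not a description of the enzymology: the function names, parameters, and example sequences are all hypothetical, only the top strand is tracked, P nucleotides are modeled as the reverse complement of the last bases of a coding end (which makes the junction palindromic), and the N region is supplied by the caller rather than generated by TdT.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def coding_joint(upstream: str, downstream: str, *, p_up: int = 0,
                 p_down: int = 0, trim_up: int = 0, trim_down: int = 0,
                 n_region: str = "") -> str:
    """Assemble a toy coding joint (top strand only).

    p_up / p_down: number of P nucleotides created when the hairpin on
        that coding end is opened off-center.
    trim_up / trim_down: bases removed from that end by exonuclease;
        trimming is applied after P addition, so it removes P nucleotides
        first, in line with P nucleotides being seen on undeleted ends.
    n_region: nontemplated nucleotides inserted between the two ends.
    """
    up = upstream + (revcomp(upstream[-p_up:]) if p_up else "")
    down = (revcomp(downstream[:p_down]) if p_down else "") + downstream
    up = up[: len(up) - trim_up]
    down = down[trim_down:]
    return up + n_region + down


if __name__ == "__main__":
    # Hypothetical coding ends: the upstream end is nibbled by 2 bases,
    # the downstream end is spared and gains 2 P nucleotides, and a
    # 2-base N region is inserted between them.
    joint = coding_joint("GGTTAC", "GATCCA", p_down=2, trim_up=2,
                         n_region="GG")
    print(joint)
```

With these inputs the downstream end becomes `TCGATCCA`, whose first four bases (`TCGA`) read the same on both strands, which is the palindrome that gives P nucleotides their name.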

In Vitro Experiments to Investigate Substrates and Products

Investigations of the recombination mechanism have been
advanced by the development of methods for following these
events in vitro. Some experiments have exploited the ability of the
Abelson murine leukemia virus (AMuLV) to selectively transform
pre-B cells without abolishing the active V gene assembly
characteristic of this stage of lymphoid development. Several AMuLV
lines have been cloned and then repeatedly subcloned in order to
follow the progression of recombination events (187-189).
Particularly valuable information has been gained by transfecting
AMuLV lines, as well as other lymphoid and nonlymphoid cells,
with artificial gene constructs capable of undergoing V(D)J
recombination. Several such constructs have been designed with
selectable markers whose expression depends on a recombination event.
For example, Lewis et al. (190) used a retroviral construct in which
a drug-resistance gene was placed between Jκ and Vκ sequences
such that the gene could be expressed only after inversional VJ
recombination. In another strategy, Lieber et al. (191) transfected
various cell lines with a plasmid containing an ampicillin
resistance gene (Amp^r) plus a chloramphenicol resistance gene (Cam^r)
whose expression was blocked by a stop codon flanked by two
V(D)J recombination signal sequences. In B cells, recombination
between the two signal sequences deletes the stop codon, allowing
expression of the Cam^r gene. When extrachromosomal circles are
recovered from the cells and transfected into bacteria, the extent of
recombination can be determined by the fraction of transformed
Amp^r bacteria that are also Cam^r. Depending on the orientation of
the signal sequences in the starting construct, the recombination
products represent coding or signal joints. These joints can be
recovered efficiently from the Cam^r colonies for analysis; the
sequences of these joints have all the characteristics of natural
recombination products. One interesting outcome from such experiments
was the discovery that pre-B and pre-T cells from SCID mice were
capable of recombination to form signal joints but were markedly
defective in their ability to join coding ends to form coding joints
(192).
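The readout of the plasmid assay is a simple ratio, sketched below. The function name and the colony counts are hypothetical and chosen only for illustration; the calculation assumes that every recovered plasmid confers ampicillin resistance and that only recombined plasmids also confer chloramphenicol resistance, as the description above implies.

```python
def recombination_fraction(amp_colonies: int, amp_cam_colonies: int) -> float:
    """Fraction of recovered plasmids that have recombined.

    amp_colonies: colonies growing on ampicillin alone (all plasmids).
    amp_cam_colonies: colonies growing on ampicillin plus chloramphenicol
        (plasmids whose stop codon was removed by recombination).
    """
    if amp_colonies <= 0:
        raise ValueError("need at least one ampicillin-resistant colony")
    if not 0 <= amp_cam_colonies <= amp_colonies:
        raise ValueError("doubly resistant colonies cannot exceed the total")
    return amp_cam_colonies / amp_colonies


if __name__ == "__main__":
    # Hypothetical counts, not data from the cited experiments.
    fraction = recombination_fraction(amp_colonies=2000, amp_cam_colonies=50)
    print(f"{fraction:.1%} of recovered plasmids recombined")
```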

Indeed, from recombined engineered substrates, certain nonstan- 
dard joints also have been recovered, which, although not con- 
tributing to physiologic V gene assembly, may reflect features of 
the recombination mechanism (193). These nonstandard joints can 
be understood by appreciating that there are three topologies in 
which DNA that has been cut twice — generating four ends — can be 
rejoined. If the four ends are coding(V), signal(V), signal(J), and 
coding(J), the three possibilities can be defined by considering the 
three different ends that may join to the coding(V) end (assuming 
that the remaining two ends must join to each other). The possibil- 
ities are as follows: 

1. coding(V)-coding(J) plus signal(V)-signal(J). This is the
standard reaction product in which the coding(V)-coding(J)
product encodes the assembled VJ gene and the signal(V)-signal(J)
represents the signal joint.

2. coding(V)-signal(V) plus signal(J)-coding(J). These prod- 
ucts (open and shut joints) look like the starting DNAs but 
can be distinguished from them if nucleotides have been 
added or deleted at the junctions so that they no longer 
hybridize to oligonucleotide probes specific for the coding/ 
signal junction. 

3. coding(V)-signal(J) plus signal(V)-coding(J). These are hybrid 
joints, in which the signal ends have switched places. 

The fact that all these recombinations can occur readily in the 
transfected construct DNA — which contains little Ig gene sequence 
beyond the DNA immediately flanking the V and J — suggests that 
neither a specific chromosomal location nor extensive flanking 
sequences are necessary for the recombination. 
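The statement that four ends can be rejoined in exactly three ways follows from fixing one end and choosing its partner. A minimal sketch of that enumeration is given below; the function and dictionary names are hypothetical, and the labels simply restate the three possibilities listed above.

```python
ENDS = ("coding(V)", "signal(V)", "signal(J)", "coding(J)")

# Name of each outcome, keyed by the end that joins to coding(V).
LABELS = {
    "coding(J)": "standard: coding joint plus signal joint",
    "signal(V)": "open-and-shut joints",
    "signal(J)": "hybrid joints",
}


def rejoining_topologies(ends=ENDS):
    """List every way to pair four DNA ends into two joints.

    The first end is fixed; each of the other three ends is tried as its
    partner, and the two ends left over must join each other.
    """
    first, others = ends[0], ends[1:]
    topologies = []
    for partner in others:
        remaining = tuple(e for e in others if e != partner)
        topologies.append(
            {"joints": ((first, partner), remaining), "label": LABELS[partner]}
        )
    return topologies


if __name__ == "__main__":
    for topology in rejoining_topologies():
        (a, b), (c, d) = topology["joints"]
        print(f"{a}-{b} plus {c}-{d}: {topology['label']}")
```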

[Figure 13 diagram. Step labels: RAG1 + RAG2; hairpin nicking? exonuclease?; terminal deoxynucleotide transferase; DNA polymerase? ligase IV, XRCC4?; P and N nucleotides are marked in the final joint.]

FIG. 13. Model for V assembly recombinations. All V assembly recombination reactions (in Ig and TCR genes) may
proceed by a common mechanism, illustrated here by D-J recombination. The RSS 7-mers are depicted in triangles,
which is the conventionally used RSS graphic. Hairpin loops are created on coding ends dependent on the action of
RAG1 and RAG2. This reaction also generates two signal ends, which are ligated together. In the example shown here, after the
opening of the hairpin loops on the coding ends, the D coding sequence is nibbled by exonuclease, whereas the J
coding sequence is spared and instead shows P nucleotide generation due to asymmetric hairpin cleavage. N-region
addition is pictured in this example as occurring only on the D-region end, but in reality, exonuclease digestion and
N nucleotide addition can occur on either (or both) coding ends. The steps in the proposed mechanism are discussed
in the text.

The critical characteristics of the signal sequences have been
explored using transfected constructs carrying various mutations
(194,195). These experiments have verified the importance of the
7-mer/9-mer sequences and have shown that the spacer sequences are
not critical as long as the spacer length (12 or 23 bp) is maintained.
The 7-mer appears especially important, with the 3 bp closest to the
coding sequence being most critical for signaling recombination.
Recombination activity could be detected in a variety of pro-B and
pre-B cells as well as pre-T lines, but was virtually absent in mature
B and T cells and in cells of nonlymphoid lineages (191).
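
The features of the signal sequence described above can be written
as a small screening function. The following Python sketch is only
an illustration: the consensus 7-mer and 9-mer strings are the
commonly cited consensus sequences and are not quoted in this
section, and the function names and the mismatch threshold are
hypothetical choices made for the example.

```python
# Consensus RSS elements, written 5'->3' starting at the coding border.
# (Commonly cited consensus; an assumption, not taken from the text above.)
CONSENSUS_7MER = "CACAGTG"
CONSENSUS_9MER = "ACAAAAACC"
SPACER_LENGTHS = (12, 23)  # only the spacer length matters, not its sequence


def parse_rss(seq):
    """Split a candidate RSS into (7-mer, spacer, 9-mer), or return None."""
    seq = seq.upper()
    for spacer_len in SPACER_LENGTHS:
        if len(seq) == 7 + spacer_len + 9:
            return seq[:7], seq[7:7 + spacer_len], seq[7 + spacer_len:]
    return None


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def looks_like_rss(seq, max_mismatches=2):
    """Crude screen; max_mismatches is a hypothetical tolerance."""
    parts = parse_rss(seq)
    if parts is None:
        return False
    heptamer, _spacer, nonamer = parts
    # The 3 bp of the 7-mer nearest the coding sequence must match exactly.
    if heptamer[:3] != CONSENSUS_7MER[:3]:
        return False
    return (mismatches(heptamer, CONSENSUS_7MER) <= max_mismatches
            and mismatches(nonamer, CONSENSUS_9MER) <= max_mismatches)
```

In this sketch a change in spacer sequence has no effect, whereas a
change in spacer length or in the first 3 bp of the 7-mer causes the
candidate to be rejected.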

P Regions and Hairpin Intermediates 

To study intermediates in V(D)J recombination, Roth et al.
(196,197) designed a Southern blot strategy to detect double-strand
breaks in the TCR δ locus between D and J, and applied this
strategy to DNA from newborn murine thymus cells, which actively
rearrange the TCR δ locus. Compared with DNA from adult liver,
additional bands were found in the newborn thymus DNA, repre- 
senting DNA fragments extending from a D-derived 7-mer on one 
end to a J-derived 7-mer on the other end. In order to characterize 
the sequence of the signal ends in detail, several laboratories have 
used ligation-mediated PCR (LM-PCR). This technique involves 
ligating the blunt genomic signal ends with a double-stranded 
oligonucleotide, followed by amplification extending from a primer 
sequence near the genomic signal end to the added oligonucleotide 
sequence; amplified products can then be cloned and their sequence 
determined. LM-PCR analyses of both TCR and Ig genes have 
defined the signal ends as blunt-ended cuts, usually exactly at the
7-mer border, leaving 5' phosphate and 3' hydroxyl groups (198,199).

In their original Southern blot assays, Roth et al. detected the 
signal ends from these cuts at levels representing about 2% of the 
thymus DNA; but the coding ends could not be visualized at all, 
perhaps because of rapid processing of these ends into coding 
joints. Based on the known defect of SCID lymphocytes in form- 
ing coding joints, Roth et al. (196) reasoned that SCID thymocytes 
might accumulate the cut coding ends that could not be visualized 
in normal thymocytes. Indeed, coding ends were detected in the 
SCID thymocyte DNA and, moreover, were found to have several 
properties suggestive of a hairpin-like structure. First, the coding 
ends in SCID thymocyte DNA were resistant to exonuclease treat- 
ment. Furthermore, restriction fragments bearing these in 
vivo-generated ends on one side were found to move on a denatur- 
ing electrophoresis gel as if they were twice as long as predicted 
from the size of the double-stranded fragment before denaturation. 
Finally, LM-PCR experiments failed to detect the coding ends 
unless they were pretreated with a single strand-specific endonu- 
clease, consistent with the impossibility of ligation to a hairpin 
unless it was first opened. The sequences of LM-PCR products 
obtained after endonuclease treatment suggested that the hairpins 
contained the entire sequence of the coding element, without loss 
or gain of a single nucleotide (200). 

One interpretation of these hairpin ends in the SCID DNA is 
that they represent normal V(D)J recombination intermediates 
that — in wild-type cells — are opened at variable positions within 
the hairpin loop by an endonuclease activity that is dependent on 
the normal allele of the SCID gene. P nucleotides could then 
result from opening the loop at an asymmetric position (Fig. 13); 
this model would explain the absence of P nucleotides from cod- 
ing ends that have been "nibbled" after opening of the hairpin. 
The unusually long P nucleotide segments observed in the rare
coding joints assembled in SCID mice might then be interpreted
as resulting from resolution of hairpins by nonspecific nicking
enzymes that, unlike the endonuclease activity dependent on the
normal allele of the SCID gene, do not focus on the hairpin loops
but nick in variable positions in the double-stranded hairpin
"stem" (196).

In support of the notion that the hairpin coding ends are
intermediates in normal VJ recombination, Ramsden and Gellert (201),
using LM-PCR, were able to detect such ends in a normal (non-SCID)
B-lymphoid line engineered to sustain a high level of κ gene
recombination. In this line, broken Jκ1 signal ends were detectable
in amounts corresponding to 30% to 40% of the κ loci present.
Coding ends were observed at 10- to 100-fold lower abundance,
with both hairpin and open ends. The observed kinetics were
consistent with simultaneous production of signal ends and hairpin
coding ends, with rapid processing of hairpins to open coding ends 
and then to coding joints, but slower ligation of signal ends. This 
model is further supported by the observation that linear DNA mol- 
ecules with hairpins at both ends can, after transfection into B-cell 
lines, be recovered as recircularized molecules, with the frequent 
creation of P insertions (202). Interestingly, SCID B cells perform 
about as well as normal B cells in this assay, suggesting that the 
protein missing in SCID cells is not the hairpin nicking enzyme 
itself, but rather an activity that makes natural endogenous hairpin 
coding ends available to the enzyme; these endogenous ends may 
require such an activity because of their association with recombi- 
nase or other chromosomal proteins, whereas transfected hairpins 
free of attached proteins may be accessible independent of the pro- 
tein missing in SCID. The molecular basis of the SCID defect is 
considered in more detail below. 
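
The way in which asymmetric opening of a hairpin could give rise to
P nucleotides can be illustrated with a short Python sketch. It is a
simplified model under stated assumptions: the coding end is
represented only by its top strand written 5' to 3', the nick is
placed a chosen number of nucleotides from the hairpin tip on the
bottom strand, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def open_hairpin(top_strand, offset):
    """Top strand after nicking the hairpin `offset` nt from its tip.

    In the hairpin, the 3' end of the top strand is joined to the 5' end
    of the bottom strand. A nick `offset` nucleotides into the bottom
    strand leaves those nucleotides attached to the top strand.
    offset = 0 models opening exactly at the tip (no P nucleotides).
    """
    if not 0 <= offset <= len(top_strand):
        raise ValueError("offset must lie within the coding end")
    bottom_strand = reverse_complement(top_strand)
    return top_strand + bottom_strand[:offset]


def p_nucleotides(top_strand, offset):
    """Return only the nucleotides gained by asymmetric opening."""
    return open_hairpin(top_strand, offset)[len(top_strand):]
```

For example, opening the coding end ACGTTC two nucleotides from the
tip gives ACGTTCGA; the added GA together with the adjacent TC forms
the palindrome TCGA, which is the signature of a P insertion.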

In Vitro Cell-Free VDJ Recombinase Activities

In attempts to discover components of the VDJ recombination 
machinery, several groups have studied endonuclease activities 
from lymphoid sources that cleave DNA selectively near the 7-mer 
element (203-205); but these experiments have not led to any 
breakthroughs, and it presently seems unlikely that any of these 
activities represent components of the recombinase. 

RAG Genes 

A rather unlikely experiment has led to a major breakthrough: 
the identification of two genes whose products are apparently crit- 
ical for V(D)J recombination in B- and T-lineages. Schatz and Bal- 
timore (206) stably transfected fibroblasts with a construct con- 
taining a selectable marker whose expression was dependent on 
VDJ recombination; as expected, no measurable recombination 
occurred in this nonlymphoid cell. However, when either human or 
murine genomic DNA was transfected into these fibroblasts, a 
small fraction of recipient cells stably expressed recombinase 
activity, activating the selectable marker. This suggested that a
single transfected genomic DNA fragment was able to confer
recombinase activity in a fibroblast. After successive rounds of
transfection and selection for recombinase activity, the critical genomic
fragment was identified. This fragment turned out to contain two 
genes, designated RAG-1 and RAG-2. These genes are not homol- 
ogous to each other, and neither is strikingly similar to any other 
known genes, although a weak relationship between RAG-1 and a
topoisomerase has been suggested. Both genes are required for 
activity; therefore, these genes would not have been discovered by 
this transfection technique if they had been situated too far apart in 
the genome for both to be transferred on a single DNA fragment. 
The genes are notable for having no introns in most species
(certain fish are exceptions) and for their close association and
opposite transcriptional orientation in all species examined. The close
proximity of these two genes related by function but not by 
sequence has led to the speculation that they might have arisen 
from a more primitive viral or fungal recombination system (207), 
as discussed later in this chapter. 

A crucial role for the RAG genes in V assembly recombination 
is supported by the conservation of these genes in a variety of Ig- 
producing vertebrate species, from humans to sharks (208), 
whereas RAG homologs have not been identified in any species
that does not demonstrate Ig V gene assembly recombination. 
RAG-1 and RAG-2 are expressed together in pre-B and pre-T cells, 
specifically at the stages expressing V(D)J recombinase activity. 
Moreover, mouse strains in which either gene has been eliminated 
by homologous recombination (gene knockouts) have no mature B 
or T cells as the apparent result of abrogation of V(D)J
recombination (209,210). Recently a subset of human patients with a SCID
syndrome and no B lymphocytes were found to have mutations in
RAG genes (211). 

Attempts to demonstrate activities of the RAG proteins on 
recombination substrates in vitro were hampered by poor solubility 
of the proteins, but functional analyses of mutated RAG genes — 
using RAG expression vectors cotransfected with recombination 
substrate plasmids into fibroblasts — showed that surprisingly large 
segments of both proteins can be deleted without eliminating 
recombinase activity (212); and some of the deleted proteins were 
soluble and could be handled relatively easily as fusion proteins. 
This work allowed the demonstration that in a cell-free in vitro sys- 
tem the two RAG proteins together can perform cleavage of sub- 
strate DNAs as well as hairpin formation on the coding end (213). 
The reaction occurs in two steps: first a nick occurs on one strand 
adjacent to the heptamer — the top strand as drawn in Fig. 13 — then 
the 3' hydroxyl created at the nick causes transesterification by
nucleophilic attack on the phosphodiester bond adjacent to the 7- 
mer on the bottom strand (Fig. 13), yielding a hairpin on the coding 
end and a new 3' hydroxyl on the 3' end of the bottom 7-mer strand 
(214). This transesterification mechanism is consistent with the 
observation that the formation of the new phosphodiester bond in 
the hairpin occurs in the absence of external energy source such as 
ATP, apparently using the energy inherent in the phosphodiester 
bond broken in the nucleophilic attack. The stereochemistry 
observed in the reaction suggests that no phosphodiester linkage to 
protein occurs as an intermediate, such as occurs in bacteriophage 
lambda integration; instead the direct transesterification mechanism 
resembles Mu transposition and retroviral integration, which can
both produce hairpins under certain experimental conditions (214). 
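
The products of the two-step cleavage reaction can be sketched in
Python as follows. This is a schematic model under stated
assumptions, not a description of the enzymology: the substrate is
given as a top strand written 5' to 3' with the coding sequence
upstream of the RSS, the 7-mer is assumed to be the commonly cited
consensus CACAGTG, and the function name is hypothetical.

```python
HEPTAMER = "CACAGTG"  # assumed consensus 7-mer, for illustration only
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def rag_cleave(top_strand):
    """Model cleavage exactly at the border between coding end and 7-mer.

    Returns (coding, hairpin, signal_end):
      coding      top-strand coding sequence, with no loss or gain
      hairpin     the sealed coding end read as one continuous strand
                  (top strand followed by its reverse complement)
      signal_end  top strand of the blunt signal end, starting at the 7-mer
    """
    border = top_strand.find(HEPTAMER)
    if border == -1:
        raise ValueError("no 7-mer found in substrate")
    coding = top_strand[:border]
    signal_end = top_strand[border:]
    hairpin = coding + reverse_complement(coding)
    return coding, hairpin, signal_end
```

Because the hairpin is a single continuous strand containing both
the top and bottom coding strands, it is twice as long as the coding
end once denatured, in keeping with the denaturing gel behavior of
the SCID coding ends described earlier.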

The actions of the RAG proteins were found to be dependent on
divalent ions in the medium (215,216). In Mn²⁺ the RAG proteins
catalyzed cleavage of substrates with a single RSS; but in Mg²⁺
cleavage required two RSSs and occurred most efficiently if the
substrate conformed to the 12/23 rule regarding the spacing
between 7-mer and 9-mer elements; thus, this rule may be a result
of RAG protein specificity, although other proteins also seem to
contribute to 12/23 specificity, perhaps by promoting an optimal
molecular architecture (217). In Ca²⁺ the RAG proteins and a
radiolabeled DNA substrate containing an RSS formed a stable complex
that was apparent in an electrophoretic mobility shift assay
(EMSA) and was stable to competition with unlabeled substrate,
but the substrate was not nicked or cleaved. However, when Mg²⁺
was added to the stable complex, substrate cleavage occurred.
(Interestingly, in the presence of Ca²⁺ human immunodeficiency
virus (HIV) integrase and Mu transposase also form similar stable
complexes in which substrate DNA is bound but no cleavage
occurs.) The Ca²⁺-mediated binding of RAG proteins to substrate
DNA was decreased 10-fold by the elimination of the 9-mer from
the RSS, so the RAG proteins must recognize both components of
the RSS. In contrast, mutations in the 7-mer that altered the 
nucleotides closest to the coding region (residues known to be crit- 
ical for supporting cleavage) had minimal effect on binding. This is 
consistent with other evidence (218,219), suggesting that these 
nucleotides, and the adjacent nucleotides in the coding region, may 
contribute to a local alteration in DNA helix structure that is impor- 
tant for the cleavage reactions. Recently, in vitro experiments have 
been reported in which RAG proteins supplemented with extracts 
from several cell lines were able to generate signal joints (220) and 
coding joints (221,222). In one system, RAG-1 and RAG-2 could
be detected in a stable complex containing two signal ends, an 
HMG (high-mobility group) protein, and perhaps other proteins as 
well (223). These in vitro VDJ recombination experiments should 
allow the elucidation of other components required for the reaction, 
as well as its mechanism. 

The double-strand DNA breaks catalyzed by the RAG proteins 
could be potentially deleterious if they occurred during DNA syn- 
thesis or mitosis, but this problem appears to be prevented by tight 
posttranscriptional regulation of RAG-2 protein levels across the 
cell cycle. Although the RAG-1 protein and mRNA transcripts of 
both RAG genes vary little across the cell cycle, a phosphorylation- 
dependent degradation signal mediates destruction of the RAG-2 
protein (224,225), thereby preventing double-strand DNA breaks 
in the H-chain JH locus from occurring during M, G2, and S (199). 
The phosphorylation site, a threonine at amino acid 490, falls into 
a region of the sequence that is highly conserved across species and 
contains a consensus sequence characteristic of targets of cyclin- 
dependent kinases. This regulatory region is dispensable for enzy- 
matic activity. In RAG-2 knockout mice carrying a transgenic 
RAG-2 gene with an alanine replacing the phosphorylatable threo- 
nine, RAG-2 protein and double-stranded DNA breaks were found 
throughout the cell cycle, demonstrating the importance of the 
RAG-2 degradation signal in cell cycle control of VDJ recombina- 
tion (226). 

Although all the binding and enzymatic activities of the RAG
proteins discussed so far have required the presence of both RAG-1
and RAG-2, recent reports have suggested that RAG-1 may bind
weakly to the RSS 9-mer in the absence of RAG-2 and that this
binding may be mediated by a segment of RAG-1 that bears
sequence similarity to the DNA binding domain of bacterial inver- 
tases (227,228). Two circumstances have been described in which 
only one of the two RAG genes is expressed. RAG-1 was reported
to be expressed without RAG-2 at low levels in the developing
central nervous system (229) (although the RAG-1-deficient mice
show no obvious central nervous system defects). Conversely, 
RAG-2 is expressed without RAG-1 in the chicken bursa of Fabri- 
cius (230), which contains B-lineage cells at a developmental stage 
when their genes have already undergone V(D)J rearrangement and 
are in the process of being diversified by gene conversion (as dis- 
cussed later in this chapter). The significance of this finding is 
unclear at present because RAG-2 is apparently not essential for 
the gene conversion itself (231). 

Apart from the obvious importance of the RAG proteins in 
mediating the initial steps of VDJ recombination, knowledge of 
these proteins and their genes has allowed two major technical
advances that have opened the way to many additional experi- 
ments. One such advance is the availability of the RAG-1 and 
RAG-2 knockout mice. These mice have no functional B cells or T 
cells, and are not "leaky" like SCID mice, which develop some
functional B and T cells, especially as the animals age. The RAG
knockouts can be used to study the importance of the innate 
immune system (i.e., responses that occur in the absence of anti- 
gen-specific lymphocytes) in particular immune responses. The 
knockouts can be used as recipients for various lymphocyte subsets 
to explore the roles of different cell types. They can be used to 
study the signals for B cell development by introducing transgenes 
with specific functionally recombined Ig genes and characterizing 
the phenotypes of lymphocytes that develop. Finally, they can be 
used in RAG complementation experiments designed to assess the 
phenotype (in lymphocytes) of various other gene knockouts (232). 
In RAG complementation, embryonic stem (ES) cells in which the 
gene of interest has been knocked out by homologous recombina- 
tion are injected into homozygous RAG knockout blastocysts 
(RAG-/-); this procedure yields chimeric mice in which all B and 
T cells derive from the engineered ES cells, which are the only 
source of intact RAG genes to support lymphocyte development. 
Such animals can be made more easily than a knockout mouse line 
and can be used to study the effect of gene deletion in lymphocytes 
independent of effects the deletion may have in other cells. In par- 
ticular, for cases where the gene knockout causes embryonic lethal- 
ity due to effects on nonlymphoid cells, RAG complementation 
allows the selective knockout in lymphocytes to be studied. 

The second major technical fallout from the RAG genes is the 
method for investigating VDJ recombination in nonlymphoid cell 
lines with well-characterized mutations in genes governing DNA 
repair; when such lines are transfected with RAG genes, the effects 
of these gene mutations on VDJ recombination can be assessed. 

Components of Later Steps in VDJ Recombination 

Clearly the RAG genes are critical for the first steps in VDJ 
recombination (recognition of RSS, cleavage, and hairpin forma- 
tion), but additional components are required to complete the reac- 
tion; and at least some of these components may function not only in 
VDJ recombination but also in ubiquitous DNA repair pathways. 
The first clear example of such a component to be recognized was 
the murine SCID mutation described above. This mutation was orig- 
inally identified in a mouse strain that was immunodeficient as a 
result of a marked impairment in VDJ recombination of both Ig and 
TCR genes; SCID lymphocytes can perform the RAG-mediated 
reactions of cleavage and hairpin formation, and can form signal 
joints, but are markedly defective in coding joint formation. Subse- 
quently it was found that the SCID mutation also blocks the enzy- 
matic mechanism responsible for repair of double-strand DNA 
breaks — such as those caused by ionizing radiation — in both lym- 
phoid and nonlymphoid cells. This suggested that after RAG-medi- 
ated DNA cleavage, lymphocytes may complete the joining reactions 
using enzymes that function ubiquitously in DNA repair. To test this 
hypothesis, Taccioli et al. screened panels of Chinese hamster ovary 
(CHO) cell lines carrying well-characterized defects in DNA repair 
to see if these lines were impaired in performing VDJ recombination 
after transfection with the RAG genes (233). Such cells had previ- 
ously been classified into x-ray cross complementation (XRCC) 
groups by investigating the outcome when two mutant cell lines are 
combined to make a somatic hybrid. If such a hybrid shows no DNA 
repair defect, this implies that the two original cell lines carry differ- 
ent mutations such that the hybrid ends up with a normal copy of 
each gene (the mutant cell lines cross-complemented one another). 
Conversely, by definition, cells in the same cross-complementation 
group are unable to complement each other. Of eight XRCC groups 
of ionizing radiation-sensitive rodent cell lines, three were known to
be defective in repair of double-strand breaks in DNA (XRCC
groups 4, 5, and 7), and all three of these groups were found to be 
impaired in VDJ recombination after RAG transfection. 
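
The logic of assigning mutant lines to cross-complementation groups
can be sketched in Python. The cell line names, the gene labels, and
the function names below are hypothetical and serve only to
illustrate the reasoning; the sketch also assumes that each line
carries a single defective gene, so that comparison with one
representative of each group is sufficient.

```python
def complementation_groups(lines, hybrid_repairs):
    """Partition mutant lines into cross-complementation groups.

    hybrid_repairs(a, b) should return True when the somatic hybrid of
    lines a and b shows no DNA repair defect (the lines complement).
    """
    groups = []
    for line in lines:
        for group in groups:
            # Failure to complement a group member places the line in that group.
            if not hybrid_repairs(line, group[0]):
                group.append(line)
                break
        else:
            groups.append([line])
    return groups


# Hypothetical example data: each line has one defective gene.
EXAMPLE_DEFECTS = {
    "lineA": "gene4",
    "lineB": "gene5",
    "lineC": "gene5",
    "lineD": "gene7",
}


def example_hybrid_repairs(a, b):
    """A hybrid is normal if each parent supplies the gene the other lacks."""
    return EXAMPLE_DEFECTS[a] != EXAMPLE_DEFECTS[b]
```

With the example data, lineB and lineC fall into the same group
because their hybrid remains defective, whereas lineA and lineD each
define a separate group.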

The genes mutated in XRCC 5 and XRCC 7 turned out to encode 
two components of a three-polypeptide complex known as the Ku 
complex. Originally characterized as the autoantigen recognized by 
a patient antiserum, Ku is composed of an approximately 70-kDa 
protein (Ku70) and an approximately 86-kDa protein (Ku86, often 
called Ku80) which, when heterodimerized, can bind to DNA (234). 
Then the DNA-Ku heterodimer complex can recruit the third
component: an approximately 450-kDa protein (DNA-PK) with a protein
kinase activity that is dependent on binding to DNA (234,234a). Some
evidence suggests that DNA-PK can bind DNA even in the absence of
Ku (234b). The conservation of Ku genes in Drosophila and yeast
suggests that the complex evolved long before VDJ rearrangement.
Extensive evidence indicates that the gene defective in XRCC 5 
encodes Ku80 (235,236), whereas the gene defective in XRCC 7 
(which corresponds to the gene mutated in murine SCID) encodes 
the 450-kDa DNA-PK (237). Although the murine SCID mutation 
impairs primarily coding joints, this difference is probably a result 
of residual DNA-PK protein in the SCID cells because a more com- 
plete equine DNA-PK mutation impairs both coding and signal 
joints (238). Ku80 mutant cell lines are also defective in both signal 
and coding joint formation, as are mice with a knockout of the Ku80 
gene (200). Ku70 mutants have not been detected in panels of exist- 
ing XRCC mutants, but recent evidence indicates that cells with 
homozygous disruption of Ku70 are also defective in VDJ recom- 
bination induced by RAG gene transfection (239). 

The Ku complex had previously been studied as an activity with 
an unusual DNA binding specificity: rather than recognizing par- 
ticular nucleotide sequences in DNA, it recognizes particular topo- 
logic features of DNA, including double-stranded DNA ends (such 
as might be generated by double-strand breaks caused by x-rays or 
by recombinases). Once bound to an end, it can translocate down 
the length of the DNA (240). The Ku heterodimer also has been 
reported to have an ATP-dependent helicase (DNA-unwinding) 
activity (241). Several models can be envisioned for the role of this 
complex in VDJ recombination: the complex may bind to the hairpin 
coding ends and regulate hairpin opening and DNA degradation by 
exonucleases; it may participate in destabilizing the DNA double- 
helix through helicase action; and it may influence recombination 
by phosphorylating other proteins via the protein kinase activity of
DNA-PK. Additional investigation will be necessary to clarify 
which (if any) of these roles is important for VDJ recombination. 

The gene mutated in XRCC group 4 also has been cloned (242). 
It encodes a ubiquitously expressed protein of about 38 kDa pre- 
dicted size that is not homologous to any known protein. The pro- 
tein has recently been found to bind to and activate DNA ligase IV, 
suggesting that this enzyme is probably important for ligating sig- 
nal and coding joints in VDJ recombination (243,244). In addition, 
the XRCC4 protein product interacts with DNA-PK and is phos- 
phorylated by this kinase (244a). 

Other Proteins that May Participate 
in V(D)J Recombination 

The RAG proteins, the Ku-DNA-PK complex, and the XRCC 4 
protein have all been shown to play a role in VDJ recombination 
because mutations that compromise these proteins impair the recom- 
bination. Various other entities have been proposed as participants in 
VDJ recombination; some of these are described below, although it 
is not clear that any of them participate in VDJ recombination.

Several laboratories have sought components of the recombinase
machinery by searching for proteins that (a) are present in nuclear 
extracts from cells early in the B-lineage in which VDJ recombina- 
tion is occurring and (b) bind in vitro to the 7-mer or 9-mer signal 
sequences in a sequence-specific manner. Although this would seem 
a reasonable approach, it is clear that binding to RSS sequences does 
not necessarily imply a physiologic role in the recombinase reaction. 
The RAG proteins, which clearly bind to RSS sequences, seem both 
necessary and sufficient to initiate VDJ recombination, so it is pos- 
sible that no further RSS-specific components are necessary for the 
recombinase. An instructive cautionary example is provided by the 
case of the RSS-binding protein for Jκ (RBP-Jκ). This protein was
purified from a murine pre-B cell line on the basis of
sequence-specific binding in vitro to a Jκ 7-mer, was found in lymphoid
cell lines but not in nonlymphoid lines, and bears sequence homology to
bacterial integrases, which reinforced its possible role in DNA
recombination; but more recent studies suggest a function for RBP-Jκ that
is unrelated to VDJ recombination (245).

With such caveats in mind, several other candidate participants in 
VDJ recombination can be mentioned. One protein capable of
binding to a probe containing the 7-mer signal element was detected by
EMSA in several pre-B cell lines, but not in myeloma, mature T-cell,
monocyte, or fibroblast cell lines, consistent with a role for the
protein in recombination-competent cells (246). Another RSS-binding
protein has been identified by a technique known as Southwestern 
analysis: a protein extract is subjected to SDS-polyacrylamide gel 
electrophoresis, blotting onto nitrocellulose, and probing with a 
radioactive oligonucleotide including the RSS. By this method a 
115-kDa protein was detected in extracts of immature B- and T-cell
lines, but not in a myeloma line; the binding of this protein seemed 
markedly reduced when mutations were introduced into either the 7- 
mer or the 9-mer in the probe (247). Another protein, designated 
T160, was obtained as a cDNA clone from a protein expression 
library that was screened with an RSS probe having a 12-bp spacer 
(248). The T160 gene product, expressed as a fusion protein, was 
able to bind to the original screening probe but not to a probe with a 
mutated 7-mer or to a probe with a 23-bp spacer. Another protein has
been identified as binding to the 9-mer RSS element but not to sev- 
eral mutated versions of the 9-mer (249). Designated NBP (nonamer 
binding protein), this 63-kDa protein was purified approximately 
20,000-fold from calf thymus. A possibly related protein designated 
VDJP was identified from a lymphoid cDNA expression library by 
screening with a Jκ RSS probe (250). The resulting full-length
cDNA represents a lymphoid-specific alternative splice form of the 
ubiquitous replication factor C (RF-C) mRNA; both sequences con- 
tain a region homologous to bacterial ligases. In vitro-expressed 
VDJP protein catalyzes a DNA joining reaction dependent on an 
RSS sequence in the DNA fragments that are joined (251). However, 
the substrates and products of this joining reaction differ in many 
critical respects from Ig genes, so the relevance of this protein to 
VDJ recombination is uncertain. A protein designated recognition 
component (Rc) is encoded by a cDNA that was isolated from a 
cDNA expression library from mouse thymocyte RNA using a radi- 
olabeled RSS as a probe (252). The in vitro-expressed protein binds 
DNA as multimers, suggesting a possible role in bringing together 
elements to be joined by VDJ recombinase machinery (253). A 30- 
kDa protein recognizing both 7-mer and 9-mer of the RSS was 
detected in immature thymocytes enriched for pre-T cells undergo- 
ing VDJ rearrangement (254). A possibly related cDNA clone with 
homology to DNA helicases, designated lymphoid-specific helicase 
(lsh), was cloned from fetal thymus (255). 



Exonuclease 

Many recombined V regions are found to be missing variable 
numbers of nucleotides at the recombination junctions compared 
with the coding sequences present in their respective germline V, D, 
or J precursors. This observation has been proposed to result from 
exonuclease-induced nibbling of the cut DNA ends during the time 
between cleavage near the 7-mer RSS and rejoining of the cut DNA 
ends. Although the responsible exonuclease has been sought (256), 
and several exonucleases are known to exist in mammalian cells,
the specific enzyme that nibbles the ends of V, D, and J segments 
has not been definitively identified. 

N Regions and TdT 

Terminal deoxynucleotide transferase (TdT), the proposed source of
N-region additions, is an enzyme found in thymus and bone marrow,
and also is a distinguishing characteristic of lymphoid versus 
myeloid leukemias. It catalyzes the addition of nucleotides onto the 
3' end of DNA strands. Although no template specificity
determines the nucleotides added, the enzyme adds dG residues
preferentially. This fact is consistent with a role for this enzyme in the
origin of N regions found at the V-D and D-J junctions because 
these N nucleotides tend to be G-rich at the 3' ends of both the 
upstream coding strand and the downstream noncoding strand. 
Both N-region addition and TdT are characteristically absent from 
fetal lymphocytes (257). N-region addition is common in H-chain 
genes but rare in murine L-chain genes, although perhaps less rare 
in humans (258). 
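
Template-independent addition with a preference for dG can be
illustrated by a small Python simulation. The numerical weights, the
maximum number of added nucleotides, and the function name are
hypothetical values chosen for the sketch; they are not measurements
of TdT behavior.

```python
import random

# Hypothetical relative preferences; only the bias toward G reflects the text.
BASE_WEIGHTS = {"G": 0.4, "C": 0.2, "A": 0.2, "T": 0.2}


def add_n_nucleotides(strand, max_added=5, rng=None):
    """Append 0..max_added untemplated nucleotides to the 3' end of `strand`.

    `strand` is written 5'->3', so additions go on the right-hand end.
    No template is consulted; each base is drawn independently.
    """
    rng = rng or random.Random()
    count = rng.randint(0, max_added)
    bases = list(BASE_WEIGHTS)
    weights = [BASE_WEIGHTS[base] for base in bases]
    added = "".join(rng.choices(bases, weights=weights, k=count))
    return strand + added
```

Passing a seeded `random.Random` instance makes a run repeatable;
over many runs the added segments are G-rich, as described for N
regions at the 3' ends of the joined strands.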

The proposal that N regions result from the action of TdT has
received considerable support. Lymphocytes with engineered
defects in their TdT genes produced rearranged Ig V regions with
almost no N additions (259,260). Conversely, when TdT expression
was engineered in cells undergoing κ or λ L-chain rearrangement,
the normally low level of N-region insertion in these
recombinations was dramatically increased (261-263). This result suggests
that the low frequency of N-region sequences in normal κ or λ
recombinations is not due to the inability of these coding
sequences to accept N-region nucleotides. Instead, the preferential
occurrence of N regions in H-chain versus L-chain genes (in mice,
at least) reflects TdT levels that are higher in early B-lineage cells
undergoing H-chain rearrangement than in the later stage of
L-chain recombination; indeed, mice with an engineered mutation
that allows premature Vκ-Jκ joining in pro-B cells show an
increased frequency of N-region nucleotides in their recombined
Vκ genes (264). In normal mice the expression of a μ H chain may
downregulate TdT expression (265), contributing to the reduced
level during the stage of L-chain recombination.

N regions are also observed in TCR genes, in which they may be 
particularly significant as a source of sequence diversity in view of 
the lower germline V diversity and absence of somatic mutation in 
the TCR gene systems. Although N regions clearly enhance the 
diversity of Ig and TCR V regions, mice lacking TdT show no sig- 
nificant deficiencies in immune responses (266). The normal phe- 
notype of such mice (apart from the absence of N regions) and the 
lymphoid-specific expression of TdT in normal mice both support 
the view that the only function of this enzyme is to diversify V- 
region genes. In TdT mutant mice, as well as in normal fetal lym- 
phocytes low in TdT activity, absence of N-region addition is asso- 
ciated with an increase in the frequency of recombination junctions 
in which short stretches of nucleotides could have derived from
either germline element because of an overlap of identical
sequences at the coding ends. These junctions suggest a recombi- 
nation intermediate in which the complementary single-stranded 
regions from the two coding ends hybridize to each other, much as 
"sticky ends" generated by restriction endonucleases can facilitate 
ligation of DNA fragments. Such homology-mediated recombina- 
tion may restrict the diversity of neonatal antibodies. The resulting 
antibodies possibly are enriched in specificities for commonly 
encountered pathogens, or have broadened specificity, as has been 
reported for TCRs lacking N regions (267). 
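
The kind of junction described above, in which a short stretch could
have come from either coding end, can be identified with a simple
overlap search. The Python sketch below is a simplification under
stated assumptions: both coding ends are written on the same strand,
only overlap at the very ends is considered (no prior nibbling), and
the function names are hypothetical.

```python
def microhomology(left_end, right_end):
    """Length of the longest suffix of left_end equal to a prefix of right_end."""
    limit = min(len(left_end), len(right_end))
    for k in range(limit, 0, -1):
        if left_end[-k:] == right_end[:k]:
            return k
    return 0


def join_with_overlap(left_end, right_end):
    """Join two coding ends, counting the shared stretch only once.

    Returns (junction, overlap_length). The overlapping nucleotides in the
    junction cannot be assigned to one germline element or the other.
    """
    k = microhomology(left_end, right_end)
    return left_end + right_end[k:], k
```

For the ends GTACTAC and TACGGT the shared stretch is TAC, and the
junction GTACTACGGT contains it once; ends with no terminal overlap
are simply joined end to end.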

Regulation of V(D)J Recombination 

The recombination events that occur among Ig genes must be 
among the most important events that mark the development of a 
B-cell clone. Regions of the genome are irrevocably deleted, and 
commitments are made as to which L-chain isotype and which VL- 
VH pair will be expressed in subsequent progeny cells. It would be 
expected that such significant events would be well regulated. 
Indeed, the observation that each B-cell line generally expresses 
only one L-chain isotype (isotype exclusion) and uses only one of 
the two homologous chromosomal loci for H- and L-chain genes 
(allelic exclusion) implies some form of regulation. Isotype and 
allelic exclusion ensure that each lymphocyte expresses a single 
H2L2 combination and thus a single antigen-binding specificity, a 
crucial feature of the clonal selection model of the immune 
response. Furthermore, if the same recombinase machinery medi- 
ates the V-gene assembly reactions of all the Ig and TCR gene sys- 
tems, then some mechanism must regulate which gene systems are 
susceptible to recombination in B-cell versus T-cell development. 
Current evidence suggests that VDJ recombination is controlled at 
two levels: regulation of the RAG protein levels and regulation of 
accessibility of the recombinase machinery to the germline sub- 
strates of rearrangement. Because RAG expression and locus 
accessibility are in turn regulated depending on the stage of B-cell 
development, a brief scheme of this development is presented 
below as background; a detailed account is provided in Chapter 6. 

B-Lymphocyte Development 

Figure 14 illustrates a scheme of B-lymphoid development as 
elucidated by the following: 

1. Analysis of lymphoid malignancies or virally immortalized 
cells representing different stages of arrested development 

2. Purification of subpopulations of normal cells from lymphoid 
organs by fluorescence-activated cell sorting (FACS), followed 
by analyses of different subsets 

3. Phenotypic analysis of mutant mice with defects in various 
genes critical for progression from one developmental stage to 
the next 

4. Culture of B-cell precursors in vitro using systems that allow 
developmental progression. 

FIG. 14. Ig gene recombination in B-cell development. A simplified scheme of B-cell development is presented as a 
background for discussion of Ig gene recombination. The stages occurring in the bone marrow versus in the 
periphery (e.g., lymph nodes, spleen) are shown, along with the status of IgH and IgL genes at each stage. A graphic 
image depicts the Ig-related proteins displayed on the surface at each stage; at the bottom, the stage-dependent 
expression of RAG genes and TdT (both important in V(D)J recombination) is schematically depicted, as is the 
expression of several other marker proteins. 

This figure attempts a consensus of schemes from two principal 
laboratories (268,269). Some known differences between human 
and mouse B-cell development and surface markers (270) are not 
reflected in this simplified summary figure. 

B- and T-lymphocytes differentiate from pluripotent hematopoietic 
stem cells in the fetal liver and bone marrow. The primordial 
lymphoid progenitor has the potential to differentiate into B- or 
T-lymphocytes or natural killer (NK) cells. Among the earliest 
markers that indicate B-lineage specificity are the non-Ig components 
of the pre-B cell receptor (pre-BCR): Igα, Igβ, and λ5 (271). CD19, 
which functions as a coreceptor in signal transduction, first appears 
in large proliferating pro-B cells, which also express several other 
distinguishing markers, including c-kit (receptor for the 
stem cell [growth] factor [SCF]), B220 (a B-lineage form of the 
phosphatase CD45), TdT, and CD43 (a sialoglycoprotein known as 
leukosialin). In the absence of H-chain protein, the SLC is 
displayed on the surface membrane in association with a complex of 
glycoproteins, represented by a hook shape in Fig. 14, which has 
sometimes been called a surrogate H chain (272). The next stage, 
the pre-B cell, is marked by expression of the RAG genes and 
H-chain rearrangement, as well as loss of c-kit and then CD43 
expression. Initially DJ rearrangements occur on both chromosomal 
loci, producing the preB-I cell; then germline V regions join to 
complete a VDJ gene (preB-II). The resulting μ protein appears on 
the B-cell surface along with the SLCs in a pre-BCR or μ-SLC complex 
that also includes Igα and Igβ. The resulting large pre-B cells 
proliferate, with RAG gene expression downregulated. Then the cells 
become smaller, stop dividing, turn up RAG gene expression once 
more, undergo L-chain rearrangement, and express surface IgM 
(immature B cells). When they eventually also express surface IgD 
they become mature B cells and migrate into the periphery, ready 
to be triggered by antigen exposure. 

Recombinational Accessibility and Transcription 

What maintains the locus specificity of VDJ rearrangement? That is, 
why is Ig gene recombination confined to B cells, with H-chain 
rearrangement before L-chain rearrangement, and why is TCR gene 
recombination exclusive to T cells? One possible clue is the 
observation that susceptibility to recombination seems to be 
correlated with transcriptional activity of germline gene elements. 
For example, Reth and Alt (273) reported that AMuLV-transformed 
pre-B cell lines (representing a developmental stage capable of 
rearranging VH to DJH) synthesize an RNA transcript that includes DJ 
and Cμ sequence (termed Dμ RNA). Furthermore, many germline VH genes 
are transcribed at the pre-B cell stage, just at the time when these 
genes are targets for recombination (274); these transcripts, 
designated sterile transcripts because they do not encode a 
functional Ig chain, are not seen in more mature B cells in which 
H-chain recombination has been terminated (275). Similar sterile 
transcripts have been reported for other germline Ig gene elements 
during the period when they are actively rearranging. Susceptibility 
of a segment of DNA to both transcription and recombination might be 
a reflection of a common chromosomal state (accessibility) required 
for both reactions, or transcription itself might be a prerequisite 
for recombination, perhaps by partially unwinding the DNA. 
Interestingly, a similar correlation between transcription and 
recombination has been reported for the isotype switch of H-chain 
genes, discussed above, and recombination of yeast mating-type genes. 

To further explore the relationship of transcription to 
recombination, several groups have deleted enhancer regions known to 
stimulate transcription of the murine κ or H-chain locus and found 
that recombination of the corresponding locus was substantially 
reduced. In the κ locus, one enhancer (iEκ) is located in the intron 
between Jκ and Cκ, and a second enhancer (3'Eκ) is located about 
9 kb downstream from Cκ. When homologous recombination in ES cells 
was used to replace iEκ by a neomycin resistance gene (neo^r), 
homozygous mutant mice were found to have no κ gene rearrangements 
(276). It should be noted that another report suggests that 
replacement by neo^r may impair κ gene recombination more effectively 
than simple deletion (277). Compared with effects of iEκ 
elimination, deletion of the 3'Eκ caused a more modest reduction in 
κ gene rearrangement (278). Similar conclusions on the importance of 
gene enhancers in supporting recombination have been obtained with 
transgenic miniloci capable of V(D)J recombination (279) and with 
similar constructs stably integrated into cell lines competent for 
V(D)J recombination (280). However, the relationship between 
transcription and recombination is not simple. One report suggests 
that in transgenic constructs the 3'κ enhancer actually 
downregulates recombination (264). And in mouse strains transgenic 
for another κ minilocus, linkage to a rabbit iEκ substantially 
increased recombination even though this enhancer is inactive in 
upregulating transcription in mouse cells (281). Finally, two 
elements known as KI and KII, which are located just upstream from 
Jκ1 and have no known enhancer function, appear to be important for 
Vκ-Jκ recombination because such recombination was substantially 
inhibited in B cells containing disruptions in these elements, at 
least under certain conditions (282). Clearly, further investigation 
will be necessary to clarify how gene recombination is controlled by 
transcription and chromosomal changes. The further question of how 
these parameters are themselves regulated is addressed below (with 
respect to feedback regulation by Ig proteins) and in more detail 
later in this chapter. 

Allelic Exclusion Models 

Two general models have been proposed to explain allelic exclu- 
sion and isotype exclusion. The stochastic model interprets the 
observed high frequency of defective rearranged genes as a conse- 
quence of the rarity of functional rearrangements; allelic exclusion 
would follow from the low probability of the coincident occurrence 
of two rare events in the same cell (283). In this model, the low 
frequency of λ-producing cells in the mouse would be a stochastic 
consequence of the smaller repertoire of λ V regions available to 
rearrange. An alternative to the stochastic model, which might be 
called the regulated model, was first proposed by Alt and col- 
leagues (284) and has received considerable experimental support. 
According to this model, the functional rearrangement of an L (or 
H)-chain gene in a particular B cell would inhibit further L (or H)- 
chain gene rearrangement in the same cell. If the inhibition 
occurred promptly after the first functional rearrangement, then 
two functional Igs could never be produced in the same cell. An ini- 
tial nonproductive rearrangement would have no inhibitory effect, 
so recombination could continue until a functional product resulted 
or until the cell used up all its germline precursors. 
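
The arithmetic behind the stochastic model can be illustrated with a 
short Python sketch. It assumes, purely for illustration, that both 
alleles rearrange independently and that each yields a functional 
gene with the same probability p; the function name and the sample 
values of p are hypothetical and are not taken from the studies 
cited above. 

```python
from fractions import Fraction


def double_producer_fraction(p):
    """Fraction of Ig-expressing cells carrying two functional alleles.

    Illustrative assumption: the two alleles rearrange independently,
    each giving a functional gene with probability p, with no feedback
    inhibition (the stochastic model).
    """
    both_functional = p * p
    at_least_one_functional = 1 - (1 - p) ** 2
    return both_functional / at_least_one_functional


# Hypothetical values of p chosen only to show the trend.
for p in (Fraction(1, 3), Fraction(1, 10), Fraction(1, 100)):
    print(p, double_producer_fraction(p))
```

Under these assumptions the double-producer fraction simplifies to 
p/(2 - p), so it falls only as functional rearrangements become 
rarer. In the regulated model, by contrast, the fraction would be 
zero whatever the value of p, provided that inhibition follows the 
first functional rearrangement promptly. 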

Model for Regulated Recombination 

As a framework for discussing this model in more detail, Fig. 15 
illustrates four hypothetical regulatory influences that could be 
components of an allelic exclusion control mechanism. As shown 
in the upper left corner of the figure, the first Ig gene 
rearrangements that occur in B-lineage cells join D to JH segments. 
The resulting DJ junctions are commonly seen on both chromosomes in 
early B-lineage cells from fetal liver or bone marrow that have been 
transformed by AMuLV, as well as in pro-B cells isolated from 
normal bone marrow by flow cytometry (285). Analyses of cells at 
this early stage consistently show κ and λ genes in germline 
configuration. The next recombination step is V→DJ. The expected 
frequency of VDJ joints maintaining the proper triplet reading 
frame from V to J is about one third, so most VDJ junctions will be 
nonfunctional. In addition, some rearranged H-chain genes may be 
nonfunctional despite in-frame VDJ junctions (286) owing to 
defects in the germline VH sequence. According to the model, if 
the initial VDJ junction is nonfunctional for any reason, then in the 
absence of a viable μ protein, H-chain gene recombination can 
continue on the other chromosome. If the VDJ recombination on 
the second chromosome is also nonfunctional, then the cell may 
have reached a dead end, leading to death by apoptosis (gray shape 
in Fig. 15). The apoptotic fate of such nonproductive cells has been 
supported by the observation that mice transgenic for the apoptosis 
suppression gene bcl-xL harbor an expanded population of bone 
marrow pro-B cells with almost all nonproductive VDJ joints (9). 
Although secondary rearrangement may rescue some cells with 
two nonfunctional recombinations, it is not clear how frequently 
such secondary events occur. 
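
The consequences of the one-third figure for a single pro-B cell can 
be worked through in a brief Python sketch. It is a simplification 
under stated assumptions: one V→DJ attempt per allele, a productive 
probability of exactly 1/3 per attempt (which ignores in-frame but 
defective genes), and no secondary rearrangements. The function name 
and dictionary keys are hypothetical labels, not terms from the 
model as published. 

```python
from fractions import Fraction


def heavy_chain_outcomes(p_productive=Fraction(1, 3)):
    """Outcome probabilities for one pro-B cell in the regulated model.

    Illustrative assumptions: one V->DJ attempt per allele, each
    productive with probability p_productive; a productive first
    allele blocks rearrangement of the second; no secondary
    rearrangements rescue the cell.
    """
    q = 1 - p_productive
    return {
        "productive_first_allele": p_productive,       # VDJ+ / DJ
        "productive_second_allele": q * p_productive,  # VDJ- / VDJ+
        "dead_end": q * q,                             # VDJ- / VDJ-
    }


outcomes = heavy_chain_outcomes()
survivors = (outcomes["productive_first_allele"]
             + outcomes["productive_second_allele"])
# Share of surviving cells that carry one nonproductive VDJ allele
second_allele_share = outcomes["productive_second_allele"] / survivors
print(outcomes)
print(survivors, second_allele_share)
```

Under these assumptions 4/9 of cells reach the dead end, and 2/5 of 
the survivors carry one nonproductive and one productive VDJ allele. 
Because some in-frame junctions are also nonfunctional, the true 
productive probability is lower than 1/3 and the dead-end fraction 
correspondingly higher. 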

In contrast, if the first V→DJ recombination in a pro-B cell 
produces a functional VDJ gene, then its expression will lead to the 
synthesis of μ H chain. This H chain is expressed on the surface of 
pre-B cells as a pre-BCR in association with the SLCs VpreB and 
λ5 (as discussed above) and the same Igα-Igβ heterodimer that, in 
mature B cells, transmits into the cell the activation signal initiated 
by antigen-induced cross-linking of surface IgM (see Chapter 7). 
This pre-BCR complex has two regulatory effects: it blocks further 
H-chain recombination (① in Fig. 15) and it activates κ gene 
rearrangement (② in Fig. 15). The μ-induced block to further 
H-chain rearrangement was initially hypothesized based on the static 
analysis of myelomas (i.e., as an explanation for observed allelic 
exclusion), but more recently it has been directly supported by 
experimental manipulation of Ig genes. When a functionally 
rearranged μ transgene was inserted into the genome of a mouse 
strain, it was found to markedly suppress the rearrangement of 
endogenous H-chain genes in B-lymphocytes (287), suggesting 
that the protein product of the μ transgene, expressed in pre-B 
cells, could shut off V(D)J recombination of germline elements in 
the endogenous IgH locus. A transgene encoding the membrane 
form of μ (μm) was competent to suppress endogenous VDJ 
recombination, but one encoding only the secreted form (μs) was not 
(288-290), demonstrating that a membrane form of protein is 
required to mediate allelic exclusion. This implication is supported 
by the observation that allelic exclusion is lost in mouse strains 
that, due to gene targeting, cannot express the membrane exon 
(291) or functional λ5 protein (292) that is necessary for surface Ig 
expression; in these animals, individual B cells may carry two 
productive V(D)J junctions because any μ protein resulting from an 
initial recombination on one allele cannot assemble on the 
membrane as a pre-BCR to shut off V→DJ rearrangement of the other 
allele. The signal for suppressing VDJ recombination appears to be 
mediated by the Igα-Igβ heterodimer; a transgene in which critical 
residues mediating association of μ H chain with this 
heterodimer were mutated did not suppress endogenous VDJ 
recombination, but when this transgene was engineered to express the 
cytoplasmic domain of Igα or Igβ, the resulting chimeric 
transgenes were able to shut off endogenous VDJ recombination 
(293,294). The normal pre-BCR-induced shut-off may be mediated 
in part by downregulation of RAG gene expression (295). This 
view would be consistent with the levels of RAG gene expression 
detected in murine bone marrow cells sorted by flow cytometry 
into populations representing different stages in B-lymphocyte 
development: RAG-1 and RAG-2 mRNAs were detectable in pro-B 
and early pre-B cells (corresponding to cells undergoing D→J 
and V→DJ recombination), but undetectable in the large 
proliferating preB-II cells expressing μ-SLC. RAG gene expression then 
becomes detectable again in the small preB-II cells undergoing 
L-chain V→J recombination. In addition to effects on RAG activity, 
the pre-BCR may also downregulate further VDJ recombination by 
reducing accessibility of the H-chain locus, as indicated by reduced 
sterile VH gene transcription (296) and by reduced ability of RAG 
proteins to produce broken signal ends in nuclei incubated in vitro, 
as determined by LMPCR (297). Diminished accessibility of the 
IgH locus would prevent further V→DJ recombination during the 
subsequent stage when RAG proteins are upregulated to activate 
L-chain VJ recombination. 

FIG. 15. Regulation of V assembly recombination. 
Allelic and isotypic exclusion can be explained by the 
four regulatory effects diagrammed here. Negative 
regulatory effects (thick black lines) prevent a second H- 
or L-chain recombination event from occurring after an 
earlier event leads to a functional protein product. 
Positive regulatory effects (thick gray lines terminating in 
+) switch on κ recombination only after a functional 
μ protein is produced, and switch on λ recombination 
only after nonproductive κ recombination has occurred 
on both chromosomes (shaded oval area). The latter 
effect has not been demonstrated unequivocally; 
alternatively, a functional μ protein may activate 
recombination in both κ and λ loci. 

Interestingly, the signals mediated by the pre-BCR (μ-SLC) and 
the mature BCR (IgM) are critical not only for regulating VDJ 
recombination but also as checkpoints controlling other features of 
B-lymphocyte differentiation. Thus, in bone marrows of RAG 
knockout (298) or JH knockout mice (299), B-lymphopoiesis 
appears to be blocked at the earliest pro-B stage: (a) large cells 
stain positive for B220, CD43, and c-kit, and (b) cells with surface 
markers typical of mature B cells are absent from the periphery. 
When a recombined VDJ-Cμ H-chain transgene was introduced 
into a RAG-1-/- or RAG-2-/- background (300,301), the resulting 
μ protein allowed the progression of B-lineage cells to the stage of 
small preB-II cells, where L-chain recombination would normally 
occur. These cells could not undergo VL→JL recombination in the 
absence of RAG proteins but did show upregulation of sterile κ 
transcription, an apparent reflection of regulatory effect ② in 
Fig. 15. If, in addition to the μ gene, a complete recombined L-chain 
transgene was also added to the genome of the RAG knockout 
mice, then B-cell development appeared to be restored, with 
normal numbers of B cells in the periphery, expressing mature B-cell 
surface markers and secreting antibody encoded by the transgenes. 
(An L-chain transgene alone was ineffective in "rescuing" B-cell 
differentiation.) The developmental block in RAG knockouts is 
similar to that seen in λ5 knockouts, which lack the SLC 
component of the pre-BCR. These mice are also arrested in pre-B cell 
maturation because of the absence of pre-BCR signaling. A similar 
immunodeficiency syndrome has recently been reported in humans 
with homozygous defects in the human homolog of λ5 (301a). 
Interestingly, the λ5 knockout mice could be rescued from their 
developmental arrest by a recombined κ transgene that was 
expressed in pre-B cells, indicating that a κ chain can substitute for 
the SLC in mediating maturation signals (302); indeed, even 
without the κ transgene, some maturation occurs in the λ5 knockout 
mice, a presumed result of small amounts of Vκ-Jκ recombination 
occurring before VDJH recombination and thus providing a κ 
chain that allows surface IgM expression and signaling. The 
permissive effect of the pre-BCR on developmental progression 
appears to be mediated by the Igα-Igβ heterodimer, based on 
results with the mutant or chimeric μ transgenes linked to Igα or 
Igβ cytoplasmic domains, as described above (293,294). 

The hypothesis that μ protein can activate κ-chain recombination 
(effect ② in Fig. 15) was originally derived from static 
comparisons of various B-lymphoid cell lines: cells in which only μ 
genes are rearranged and expressed are common (pre-B cell lines), 
but κ-expressing cells without H-chain gene rearrangement and 
expression are rare, as though H-chain expression were a 
prerequisite for κ expression. This view has been supported by 
observation of AMuLV-transformed lines (187) and normal B-cell 
precursors cultured in vitro: these lines always rearrange H-chain 
genes before κ genes. In a more direct demonstration that μ protein 
could stimulate κ rearrangement, an AMuLV-transformed cell that 
could not express endogenous H chain (because of defective VDJ 
rearrangements on both of its H-chain loci) was found to retain the 
κ locus stably in germline configuration until a functional μ gene 
was introduced; this μ gene activated κ gene rearrangement and 
expression (189,303). Only a gene encoding μm was effective in 
activating κ rearrangement and not a μs gene, again suggesting a 
requirement of surface pre-BCR expression for this signaling. A 
similar conclusion was deduced from the rescue of RAG knockout 
animals by a recombined VDJ-Cμ transgene, which caused upregulation 
of sterile κ transcription in preB-II cells, as described above. 
Also, a human μ transgene was found to upregulate both κ gene 
(sterile) transcription and Vκ-Jκ rearrangement in B-lymphoid 
precursors in fetal liver (296). To directly assay for accessibility 
of Ig loci to RAG proteins, Constantinescu et al. (297) incubated 
RAG-/- nuclei from different B-cell developmental stages with RAG 
proteins in vitro and detected broken signal ends by LMPCR; using 
this method, they found that introduction of a VDJ-Cμ transgene into 
the RAG-/- background caused a 30-fold increase in the frequency 
of breaks at Jκ1 observed in pre-B cell nuclei incubated with RAG 
proteins. Interestingly, a δ transgene is also apparently able to 
both inhibit endogenous VDJ rearrangement and activate κ 
rearrangement (304), so these effects must be mediated by properties 
common to δ and μ H chains. Cross-linking of the pre-BCR complex is 
apparently required for activation of κ recombination (163), 
suggesting that a ligand that occurs physiologically in the 
environment of the pre-B cell may bind to the pre-BCR to signal 
activation of gene rearrangement in the κ locus and suppression in 
the H-chain locus; however, no candidate ligand has been identified. 

Despite all the evidence cited above, it is clear that some sterile 
κ gene transcription and Vκ-Jκ recombination can occur in the 
absence of a μ-containing pre-BCR (291,305-307). This may be 
a consequence of some "leakiness" of the controls on L-chain 
recombination in early B-lymphopoiesis or may reflect a separate 
developmental lineage in which Vκ-Jκ recombination is activated 
earlier; but the low frequency of this premature Vκ-Jκ 
recombination would keep the possibility of L-chain double-producers 
violating allelic exclusion below 1% (299). 

When κ recombination begins, the possibilities for functional 
and nonfunctional VJ rearrangements resemble those discussed 
above for the H chain. According to the regulatory model, if a cell 
initially rearranges its κ locus on one chromosome 
nonproductively, then it can proceed to rearrange the locus on the 
homologous chromosome. As soon as functional κ gene rearrangement 
leads to expression of a functional κ chain that can associate with 
μ to form a surface-expressed IgM molecule (i.e., a mature BCR), 
further κ rearrangement will be suppressed (effect ③ in Fig. 15). 
This regulatory influence would explain the observation of allelic 
exclusion in κ-expressing myelomas, and it has been supported by the 
finding that a functional rearranged VJ-Cκ transgene can suppress 
rearrangement of endogenous κ genes (308). Furthermore, in 
murine B-lymphoma lines expressing RAG proteins, cross-linking 
of surface IgM with an anti-μ antibody was found to rapidly 
suppress RAG gene expression (309). A ligand might deliver a 
corresponding signal in physiologic circumstances, but this is not 
clear, especially because under some conditions cross-linking the BCR 
of pre-B cells (as might occur on binding of a self-antigen) can 
actually upregulate RAG gene expression to activate "receptor 
editing." A BCR-mediated signal seems to be required for small 
preB-II cells to advance to the immature B phenotype and move into 
the periphery because, as mentioned above, RAG knockout animals 
rescued with only a μ transgene were not able to advance beyond the 
pre-B stage, whereas a combination of μ and κ transgenes allowed 
normal B-cell proliferation, surface marker expression, and 
migration to the periphery. Igα is also apparently critical for 
mediating some differentiation signals of the BCR because mice with 
an engineered deletion in the cytoplasmic signaling domain of Igα had 
only 1% of the normal numbers of circulating B cells (310). 

Regulatory effect ④ in Fig. 15 is rather speculative because 
little is known about regulation of λ gene rearrangement. The idea 
that λ recombination is somehow triggered by nonfunctional κ 
rearrangements on both chromosomes derives from the 
observations that most B cells show isotypic exclusion (i.e., express 
either κ or λ but not both) and that κ rearrangement seems to occur 
before λ. Thus, in studies of normal and malignant human B-lymphoid 
cells (136,311), in κ-expressing cells, λ genes were found to be in 
germline configuration, whereas in λ-expressing cells, κ genes 
were either rearranged (rarely) or deleted (most commonly). The κ 
deletions reflect the RS recombination event discussed earlier in 
this chapter. These results suggest that λ genes remain 
unrearranged until κ genes rearrange nonproductively or are deleted. 
The mechanism of this apparent regulation of the λ genes by the κ 
locus is unknown. It has been suggested that somewhere in the 24 
kb between Cκ and the RS site lies a sequence that suppresses λ 
rearrangement but is deleted in the κ RS recombination event to 
alleviate this suppression. However, in contrast to this hypothesis 
are the observations that mice with a targeted deletion of either Cκ 
or the intronic κ enhancer have almost normal numbers of B cells, 
essentially all of which express λ, despite having no loss of DNA 
between Cκ and RS (276,306,312). 

It is assumed that membrane expression of a μ-λ IgM would shut 
off all further L-chain gene recombination by a similar mechanism 
to that in the κ locus (as illustrated by effect ③ in Fig. 15), a 
supposition supported by suppression of κ gene expression in λ 
transgenic mice (313). However, this suppression is somewhat leaky, 
especially in older mice (314-317), and even in normal splenocytes 
a small population of cells expresses both isotypes (318). These 
observations have led to the speculation that certain 
B-lymphocytes are not programmed for strict isotype exclusion. It is 
also possible that in some cells expressing both κ and λ, one of the 
isotype L chains has so much greater affinity for the expressed 
H-chain protein (on the basis of VH-VL compatibility) that the other 
isotype does not functionally contribute to surface Ig and is thus 
allelically excluded at the protein level. Some evidence in 
opposition to the strictly ordered (κ before λ) rearrangement model 
for L chains has been put forward (285,319), suggesting that VJ 
recombination is activated concurrently in both the κ and λ loci. In 
this model the preponderance of κ-expressing lymphocytes (in mice at 
least) would result from a stochastic process in which κ 
rearrangements are favored by the larger Vκ repertoire and other 
features of the κ locus, such as more active recombination signal 
sequences (320). Alternatively, the preponderance of κ expression may 
be explained by a model in which κ and λ rearrangement occurs 
independently, but κ recombination initiates earlier in B-cell 
development (321). In either of the latter two models of L-chain 
rearrangement, a surface IgM molecule (H2κ2 or H2λ2) would signal 
feedback suppression of L-chain recombination, so that both allelic 
exclusion and isotype exclusion would be explained by the same 
mechanism (and the regulatory effect ④ of Fig. 15 would not exist). 

RAG Protein Production after Mature BCR Expression 

Although the RAG genes are apparently downregulated through 
a signal mediated by the appearance of IgM on an immature B cell, 
there is evidence that RAG gene expression occurs in at least two 
later stages of B-cell development: during receptor editing of 
autoreactive B cells in the bone marrow and during B-cell matura- 
tion in germinal centers. 

An early observation suggesting the possibility of receptor 
editing by secondary rearrangement of κ L-chain genes came from an 
analysis of circular DNAs representing the deleted segment in VJ 
recombination (322). It was observed that, in addition to contain- 
ing the expected signal joints, many of these circles contained VJ 
junctions; these could have formed in an initial inversional recom- 
bination, which was then followed by a secondary deletional 
rearrangement that produced the observed circular DNA. Signifi- 
cantly, about a third of the VJ junctions analyzed showed no appar- 
ent defect, indicating that they could have produced a functional 
antibody that was altered by secondary recombination. Since this 
study, several laboratories have directly observed secondary κ 
rearrangements in B-cell tumors and AMuLV-transformed pre-B 
lines (188,323). H-chain V-gene replacement also can occur, 
mediated by a 7-mer embedded in the 3' end of many VH coding 
regions, as described in an earlier section. As in the case of κ 
genes, such replacement can occur even after a productive VDJ 
recombination (180,324). A potential reason for replacing produc- 
tively rearranged L or H chains would be to abort production of an 
antibody that was autoreactive. Thus, receptor editing might com- 
plement two other mechanisms for preventing autoantibodies: 
anergization and cell deletion by apoptosis. 

Several studies have supported this interpretation using mice car- 
rying transgenes expressing autoreactive antibodies (325,326). In 
one study, the JH locus was targeted for replacement by homolo- 
gous recombination with the 3H9 recombined VDJ gene; this gene 
encodes an H chain that in combination with most (but not all) κ L 
chains can bind to DNA, a self antigen (181). In such mice, most B 
cells have replaced the 3H9 gene by an upstream VH gene, with 
junctions showing typical N regions and exonuclease nibbling. 
When inserted as a normal transgene, 3H9 also stimulates L-chain 
editing, as evidenced by increased frequency of Jκ5 usage and 
reduced diversity of Vκ genes expressed by the B cells displaying 
the 3H9 H chain (325). These results are consistent with the 
interpretation that primary rearrangements, yielding Vκ proteins 
capable of supporting DNA binding, were edited by secondary 
rearrangements involving downstream Jκs and Vκ regions incompatible 
with DNA binding. Receptor editing appears to occur in the immature 
B-cell population in the bone marrow (327,328) and is associated with 
increased RAG gene expression. Indeed, BCR cross-linking of a 
human B-cell line was found to upregulate RAG gene expression 
(329). As previously discussed, BCR cross-linking also has been 
reported to terminate RAG gene expression of surface IgM+ 
immature B cells to mediate allelic exclusion (309). How a cell can 
discriminate between a BCR-mediated signal that downregulates RAG 
expression to mediate allelic exclusion and a BCR-mediated signal 
that upregulates RAG expression to initiate receptor editing is cur- 
rently not understood; but differences in receptor affinity, the pre- 
cise stage of development, or costimulatory signals might be 
involved. Interestingly, a failure of receptor editing may contribute 
to the autoantibodies mediating systemic lupus erythematosus (330). 

A second instance of late RAG expression occurs in germinal 
center (GC) B cells (331,332). RAG-1 and RAG-2 mRNA tran- 
scripts were detected by RT-PCR in FACS-purified GC cells from 
immunized mice, the RAG proteins were detected in GC cells by 
immunofluorescence, and evidence of ongoing V(D)J recombina- 
tion was found in GC cells (332a,332b). RAG expression also was 
observed constitutively in Peyer's patch GCs (which are maintained 
by food antigens in the absence of intentional immunization) and 
in splenic B cells cultured with IL-4 plus LPS (conditions known 
to induce at least one other process typical of GC cells, i.e., isotype 
switching). GC cells appear to recapitulate expression of several 
surface markers characteristic of early B-lineage cells, including 
heat stable antigen (CD24) and λ5; so it appears that the RAG gene 
expression may be just one aspect of a GC-induced reversion to a 
primitive phenotype. What function might RAG proteins have in 
the GC? The primary processes that affect Ig expression in the GC 
are somatic mutation and isotype switching, but it is unlikely that 
RAG proteins are expressed in GC to function in either of these 
processes because both can occur in B cells of RAG-/- mice
(116a,331). One possibility is that RAG-dependent receptor editing
may be turned on to replace V regions that have become autoreac- 
tive as a result of GC-induced somatic mutations; further experi- 
ments will be necessary to evaluate this interpretation. 

GENERATION OF DIVERSITY 

One of the most interesting questions about Igs is the source of 
the immense variation observed in antibody-binding specificities. 
As discussed at the beginning of this chapter, early speculative 
debates about this question centered on the relative contributions of 
germline repertoire and somatic mutation in creating diversity. One 
source of diversity that was unanticipated before the recombinant 
DNA revolution already has been discussed in this chapter in some 
detail: somatic recombinational diversity. We will now focus on the
germline repertoire and then consider the contribution of somatic 
mutation. 

Germline Diversity 

A comprehensive evaluation of the germline repertoire of V- 
gene segments requires an examination of the sequences of all 
germline V regions, a daunting task. However, modern molecular 
biology techniques — including cloning vectors allowing long 
genomic inserts and large-scale sequencing with fluorescent dyes 
and automated sample preparation — have helped realize this goal 
for the human κ, λ, and H-chain loci; and considerable progress
has been made with the murine κ and H-chain loci. (The tiny V
repertoire of the murine λ loci has already been discussed in the
section on λ genes.)

Two World Wide Web resources are devoted to providing convenient
updated access to Ig germline gene sequences. The IMGT (international
ImMunoGeneTics) database (http://imgt.cnusc.fr:8104/home.html),
coordinated by Marie-Paule Lefranc, includes a database for Ig and
TCR sequences, as well as a separate one for major
histocompatibility complex (MHC) sequences (333). In the Ig/TCR
database, all species for which data are available are included;
sequences are annotated in standard formats, and map information is
provided graphically. V Base Gold (http://www.mrc-cpe.cam.ac.uk/
imt-doc/public/INTRO.html) is an online catalog of human V gene
segments and alleles coordinated by Ian M. Tomlinson.

Germline Diversity of the Murine IgH Locus 

Attempts to analyze the mouse VH repertoire began before the 
gene cloning era with the study of VH amino acid sequences from 
mouse myelomas. Initial attempts to classify the observed VHs into 
related groups were based on limited amino acid sequence analy- 
sis, primarily of N-termini of myeloma proteins. The current 
scheme classifies two V gene sequences into the same group or 
family if they show more than about 80% nucleotide sequence 
identity, and into different families if their sequences are less than
70% identical. (Empirically, few VH comparisons yield identities
between 70% and 80%.) These criteria for sequence similarity
correspond well with the degree of similarity that allows hybridization
between a V probe and members of the same family under conditions
of moderate stringency. The initial classifications based on
this scheme identified seven VH families (334,335). Since that
time, continuing analysis of new sequences has identified eight
additional families (336-338). The families now known contribute
to the bulk of the immune response: when 2,000 cDNA clones
hybridizing to both Cμ and JH probes were analyzed, all had V
regions from the 15 families, based on hybridization or sequence
analysis, except for about 2% of the clones representing truncated
or aberrant cDNA synthesis (338). The families have been further
classified into three groups, or clans, based on sequence conservation
in the framework 1 region (FR1; codons 6-24) and FR3
(codons 67-85) (339-341). (Framework amino acids are the non-CDR
parts of the Ig V region that hold the CDR loops in position
to contact antigens.) The clans are conserved between humans,
mice, and frogs, suggesting that several fundamental steps in
germline VH diversification preceded the amphibian-reptile
divergence (342).
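
The family criterion described above (more than about 80% nucleotide
identity for the same family, less than 70% for different families) can
be written as a small function. The following Python sketch is only an
illustration: it assumes the two sequences have already been aligned to
equal length, and the function names, the treatment of gap characters,
and the label for the 70% to 80% zone are assumptions made here, not
part of the published scheme.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions carrying the same nucleotide.

    Assumes a and b are pre-aligned to the same length; a gap ("-")
    never counts as a match (an assumption of this sketch).
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same nonzero length")
    matches = sum(
        1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-"
    )
    return 100.0 * matches / len(a)


def classify_pair(a: str, b: str,
                  same_cutoff: float = 80.0,
                  diff_cutoff: float = 70.0) -> str:
    """Apply the roughly 80% / 70% family rule to one pair of V sequences."""
    identity = percent_identity(a, b)
    if identity > same_cutoff:
        return "same family"
    if identity < diff_cutoff:
        return "different families"
    # The text notes that few VH comparisons fall in this zone.
    return "intermediate"
```

For real germline sequences the alignment step matters more than the
threshold test, so the result depends on how the alignment was made.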

The classification of VH genes into families leads to two
approachable questions: (a) How many genes in each family are
available to contribute to Ig diversity? (b) How are the families
arranged on the chromosome? One straightforward approach to the
question of gene number is to count the number of bands visible on
Southern blots. However, this method can yield only a rough
estimate of gene number because of several complications in the
interpretation. The number of bands may underestimate the number of
VH regions for two main reasons: (a) a given DNA fragment may
contain more than one V-related sequence; and (b) some observed
bands may actually represent several comigrating hybridizing DNA
fragments, each containing different VH genes. On the other hand,
the number of bands could theoretically exceed the number of
different VH genes for two reasons. First, some hybridizing sequences
may not contribute effectively to sequence diversity. In particular,
some nonallelic gene pairs may be so similar that the second copy
provides no gain in amino acid sequence diversity. Other V
sequences are nonfunctional because they have become separated
from the C-region locus, even lying on a different chromosome.
Still other germline clones isolated on the basis of hybridization to
a V probe have turned out to contain multiple defects that would
preclude their expression as functional V regions even if they
underwent rearrangement (i.e., they are pseudogenes); conceivably,
these pseudogenes could contribute to diversity at the somatic level
through gene conversion, a mechanism known to operate in
chickens and rabbits as discussed later. Second, some bands could be
counted twice because of hybridization to probes of two different
families. This could occur if a specific DNA fragment carried
germline V sequences from two different families or if hybridization
occurred across group boundaries owing to a clustering of
residues identical to the probe sequence.

With these caveats, Table 1, modified from a compilation by
Kofler et al. (343), is presented to give an idea of the widely
varying complexity of the different VH groups. Several groups have
only a few members. For example, the VH S107 family yields four
Southern blot bands, and extensive cloning with a probe for this
family has in fact detected only four germline members (of which
one is apparently a pseudogene). At the other extreme, the VH J558
family shows much greater Southern blot complexity and may
contain as many as 1,000 members in the BALB/c mouse. This
estimate (344) was based on quantitative kinetics of hybridization
using an excess of single-stranded J558 probe, whereas the lower
estimate of 60 genes is based on counting Southern blot bands. If
the larger estimate is accurate, then many of the bands observed on
Southern blots with the VH-J558 probe must contain multiple
comigrating DNA fragments; these would probably represent
recent duplications in the VH locus and would be expected to
encode minimally diverged VH sequences. Other strains of mice
besides BALB/c seem to have smaller J558 families (345),
consistent with the notion of a recent expansion of J558 VH genes in
BALB/c.

The question of how the germline VH genes are arrayed on the 
chromosome has been approached by several different tech- 
niques. One straightforward method has been to screen phage 
libraries of germline DNA with VH probes and to examine clones 
containing more than one VH region. Application of this method 
to the mouse VH locus has yielded three important generalizations:
adjacent VH genes are usually members of the same family;
they are oriented in the same 5'-3' direction; and they are
spaced about 7 to 15 kb apart (346-348). The first finding sug- 
gests that members of a given VH family are clustered together 
on the chromosome. Such clustering represents a simplifying 
principle that allows the mapping of murine VH genes to be con- 
ceptually divided into establishing the order of the family clusters 
on the chromosome and then establishing the order of VH genes 
within each cluster. 

One approach to ordering the VH families has been to examine 
(by Southern blotting) the VH bands that are deleted in various 
myelomas or hybridomas. Deductions about the order of VH fami- 
lies can be made if the nonexpressed chromosome is deleted so that 
all the VH fragments observed on a Southern blot can be considered 
to derive from the same (i.e., the expressed) chromosome and to lie
upstream from the rearranged VH gene (349-351). A powerful variant
of the deletion method has used a panel of Abelson virus-transformed
B-cell lines constructed from F1 animals heterozygous for
allotype at the IgH locus. In most of these cell lines it was possible 
to distinguish deletions on the two parental chromosomes and estab- 
lish an independent VH gene order for each (338,352). 



Another mapping approach is brute force chromosome walking 
by the generation of large overlapping clones using cosmid 
libraries; this approach is being pursued by a number of laborato- 
ries, but is made difficult by the occurrence of recent duplications 
leaving nearly identical DNA segments that cannot easily be dis- 
tinguished or ordered on a map. This problem should hopefully be 
resolved by the use of yeast artificial chromosome (YAC) clones, 
which can accommodate 1- to 2-Mb segments of genomic DNA.
YACs have been useful in long-range mapping of the human V loci. 
Pulsed field gel electrophoresis also has been used to separate large 
fragments of genomic DNA for mapping by Southern blotting. 

Studies using these techniques have been in general agreement 
about the order of certain murine VH families, but a complete map 
consistent with all data from the various mapping methods is not 
currently available. It is clear from several laboratories that some 
interdigitation between families occurs (351,353), and this could 
contribute to difficulties in interpretation. A representative map of 
15 murine VH families is shown in Fig. 16, based on the work of 
Brodeur and colleagues (352,354). 

Among VH maps based on different techniques, the best agreement
is on the families closest to Cμ. The map order of these families
is: S107-Q52-7183-D-J-Cμ, with some overlap
between these three families as shown in Fig. 16. This order is
of special interest because the most proximal family cluster (desig- 
nated 7183), and in particular its most proximal member (desig- 
nated VH81X), is the V region that is significantly over-repre- 
sented in the VDJ rearrangements occurring in fetal liver pre-B 
cells (355). This observation was earlier taken as evidence favoring 
a tracking model of V gene rearrangement [i.e., a recombinase 
would engage DNA near the J regions and slide 5' to find V regions
to recombine (356)]; but alternative interpretations have been pro- 
posed based on more recent data (339,340,357,358). 

Mouse Germline DH and JH Regions 

D regions were initially hypothesized based on the highly
diverse amino acid sequences in myeloma proteins between the V
and J regions, as briefly discussed earlier in this chapter. Because
both VH and JH were known to be flanked by signal elements of
the long spacer type (23 bp), it was predicted that a germline D
region would be flanked on both sides by signal elements with short
spacers so that both V-D and D-J recombination would conform to
the 12/23 spacer rule (359).

TABLE 1. VH region families of mice and humans

                  Mouse                                           Human
Family number  Family name  Complexity(a)  Clan(b)  Group(c)  Family number  Complexity(d)

VH2            Q52          15                                VH2            4
VH3            36-60        5-8                               VH4            9
VH8            3609         7-10                              VH6            1
VH12           CH27         1

VH1            J558         60-1000                           VH1            14
VH9            VGAM3-8      5-7                               VH5            5
VH14           SM7          3-4                               VH7            5
VH15           VH15         2

VH4            X-24         2              III      III       VH3            46
VH5            7183         12
VH6            J606         10-12
VH7            S107         3
VH11           CP3          1-6
VH13           3609N        1

(a) Complexity, an estimate of the number of VH sequences in each family.
(b) Clans of VH sequences as defined by Schroeder et al. (340).
(c) Groups, based on the classification by Tutter and Riblet (339).
(d) Complexity, based on the prototypic haplotype provided by Cook and Tomlinson (381).

This table is based on a compilation of murine VH regions by Kofler et al. (343), and a review of human VH regions
by Pascual and Capra (822) and a paper by Mainville et al. (338). Original references for the data can be found in
those sources.

Finding a germline D gene with a probe corresponding to D-region
sequence from one of the cloned recombined genes was
technically difficult because the DNA segment encoding the few
amino acids of the D region would be too short to give a usable
hybridization signal. To obtain more effective probes, DNA fragments
have been isolated from DJH intermediates that have not yet
recombined with a VH gene and thus retain the 5' flanking
sequences from the germline D, providing a longer probe (360). A
DJH intermediate cloned from the myeloma QUPC52 identified its
germline D precursor, designated DQ52, 0.7 kb 5' to JH1. Its
structure was very similar to expectation: a 10-nucleotide coding
sequence flanked on both sides by RSS elements with a 12-bp
spacing. A similar clone derived from a rearranged DJH in a T-cell
line (SP2) was used as a second probe to clone nine related D
regions clustered within a 60-kb region, all having 17-nucleotide
coding segments and short spacing of the signal elements
(361,362). A third D probe called FL16, derived similarly, identified
the FL16 family, which is composed of only two germline D
genes but is well represented in the rearranged IgH-chain genes
that have been sequenced. Finally, a last D region, DST4, was
identified through the recognition of a recurring nucleotide sequence
observed between V and J in recombined VDJ regions that was not
accounted for by the previously known D sequences (363). The 13
murine D regions span about 80 kb upstream from the four JH
segments that in turn lie upstream from Cμ.

Apart from the additional combinatorial diversity contributed by
the repertoire of germline D elements, the flexibility of the
recombination site applies on both ends of the D region. Furthermore, an
out-of-frame recombination at the VD junction may be compensated
by the frame of the DJ junction so that a particular D element
could theoretically be read in all three frames in different VDJ
recombinants. As mentioned previously, this extra source of diversity
is used by human H chains (11), but the murine system has
evolved mechanisms that strongly favor the reading frame known
as RF1 (10). DJ rearrangement in RF3 is counterselected owing to
frequent internal stop codons. When DJ recombination has
occurred in RF2, the resulting transcripts can encode a DJ-Cμ
protein (designated the Dμ protein), which can be expressed on the
surface of a pre-B cell in association with the products of the
VpreB and λ5 genes (364-366). Murine cells expressing Dμ protein
cannot progress to normal Ig production, perhaps because the
Dμ protein triggers the shut-off of VDJ recombination before V
assembly is complete; therefore, expressed H-chain V regions
rarely include a DJ junction in RF2 (10). This curious model is
supported by the observations that RF2 suppression is not observed in
λ5 knockout mice (which fail to express Dμ protein on the cell
surface) (367) and that analysis of recombination in single cells by
PCR failed to detect cells containing both a DJ junction in RF2 and
a productive VDJ junction (368). In humans this mechanism
is not operative because ATG initiation codons are not generally
present 5' from D regions to encode a Dμ protein. Some
rearranged VDJ sequences seem to be interpretable as V-D-D-J
products, even in cases where D-D recombination would seem to
violate the 12/23 rule (369).
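
The idea that one D coding segment can be read in three frames, some of
which may contain stop codons, can be illustrated with a short Python
sketch. Everything named here is an assumption for illustration: the
helper functions are hypothetical, any sequence passed in is a toy
string rather than a real D segment, and the frame offsets 0, 1, and 2
are simple string offsets that are not mapped onto the RF1/RF2/RF3
nomenclature used in the text. The codon table is the standard genetic
code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(seq: str) -> str:
    """Translate complete codons only; a trailing partial codon is ignored."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def three_frames(seq: str) -> dict:
    """Return the translation of seq at string offsets 0, 1, and 2."""
    return {offset: translate(seq[offset:]) for offset in range(3)}


def frames_without_stop(seq: str) -> list:
    """Offsets whose translation contains no stop codon ('*')."""
    return [o for o, protein in three_frames(seq).items() if "*" not in protein]
```

In a real junction the frame in which D is read is set by the V-D and
D-J joints and by any N or P nucleotides, which this sketch ignores.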

Murine Germline Vκ Locus

Although murine L-chain genes were among the first Ig genes
studied by molecular biology techniques, the organization of the
mouse germline Vκ locus has been less thoroughly characterized
than the human. On the basis of N-terminal amino acid sequence
data, Potter and colleagues classified mouse Vκ sequences into 24
groups (370). Current classification based on the nucleotide
sequence criteria described above recognizes about 20 families
(371). However, different classification schemes have yielded
different estimates, perhaps because the Vκ genes show degrees of
relatedness that are not discrete steps, but rather continuous
gradations (372), as would be expected if gene duplications could occur
on a time continuum and rates of sequence diversification could
also vary. As described above for VH genes, some Vκ families are
shared by humans and mice, suggesting that the family divisions
preceded primate-rodent species divergence (373,374). The murine
locus, including about 140 Vκ sequences (genes and pseudogenes),
has been cloned on a series of overlapping bacterial artificial
chromosome (BAC) and YAC clones, and spans about 3.5 Mb upstream
from the Cκ gene on chromosome 6 (375-377). In addition, a few
Vκ sequences have been localized to other chromosomes
(chromosomes 16 and 19), where they could not contribute to diversity and
are thus considered "orphons." In the functional Vκ locus on
chromosome 6, many related Vκ sequences are found to lie clustered
together, although some interspersion of families also exists. As of
this writing, the total number of genes (versus pseudogenes) has not
been completely determined by sequence analysis.

FIG. 16. Maps of the murine and human VH loci. The 15 known murine VH gene families are shown in their
approximate map positions. Each rectangle represents a cluster of VH genes of the indicated family; the clan identification
(340) of the VH families is indicated by the color of the rectangle: black for clan I, gray for clan II, and white for clan
III. Although some interdigitation is shown by overlapping families (e.g., the Q52 and 7183 families), the families are
largely clustered. In contrast, all human VH genes (vertical lines) of a prototypic haplotype are shown in the right
panel based on the formulation by Cook and Tomlinson (381); extensive interdigitation of families is apparent.

Human Germline VH Locus 

Amino acid sequences of human myeloma VH-region proteins 
were originally classified by Kabat into three groups. These have 
been found to correlate well with three families defined by sorting 
together VH probes that produce similar Southern blot patterns. 
Four additional human VH families have been found more recently 
using molecular genetics approaches. Several of the human VH 
families show sequence similarities to particular mouse VH fami- 
lies (378), and all can be classified within the three large clans of 
murine VH genes common to the mouse, human, and Xenopus; as 
noted above, these observations suggest that significant germline 
VH diversification antedated the amphibian-reptile species diver- 
gence. 

Early phage and cosmid clones of human VH genes demon- 
strated that the human VH families are extensively interdigitated, in 
contrast to the family clusters characteristic of the murine locus. 
This interdigitated structure was confirmed by gene mapping stud- 
ies involving analysis of VH deletion in B-cell lines and of PFGE- 
based mapping data (354,379); ultimately the complete delineation 
of the human VH region has been achieved through analysis of 
overlapping YAC clones covering the entire locus (380-382). The 
VH locus spans 1.1 Mb at the telomeric end of chromosome 14 and
includes 95 VH sequences; of these, 51 are functional and most of 
the remainder appear to be pseudogenes (although the exact num- 
bers are somewhat variable depending on the haplotype, and a few 
V sequences have not yet been fully characterized). Each VH 
region is identified by a two-part number: the first number corresponds
to the VH family and the second indicates the sequential
number of the VH on the standard map (starting from the JH prox- 
imal end), with the letter P appended for pseudogenes. A particular 
VH locus is judged to be functional if it has no obvious defects in 
coding sequence and has been detected in a rearranged VDJ gene, 
indicating intact recombination signals. Additional polymorphic V 
regions are designated (using a decimal point) with reference to the 
JH-proximal standard VH; e.g. a polymorphic V region from the 
VH7 family lying between 4-4 and 2-5 is identified as 7-4.1. The 
entire VH locus thus extends from V 6-1, which is located about 77
kb 5' from JH1 (383), through V 7-81, which appears to be located 
within a few kb of the telomeric repeat sequences marking the end 
of the q arm of chromosome 14. Twenty-four additional germline 
VH sequences have been mapped to chromosomes 15 and 16 and 
represent nonfunctional orphons that were apparently duplicated 
from the functional locus on chromosome 14 (384,385); these 
sequences contributed to earlier overestimates of the length of the 
functional human VH locus. All the VH regions whose transcrip- 
tional orientations have been determined share the same orientation 
characteristic of the JH regions, consistent with recombination by
deletion rather than inversion. Several regions in the locus show
evidence of ancient duplications: segments in which a pattern of 
hybridization to different probes from nearby DNA regions is 
repeated elsewhere in the locus. Thus, the complete locus map may 
offer clues to its evolution, as well as defining the repertoire of 
germline VH diversity available to the immune system. 
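
The naming convention described above (family number, map position
counted from the JH-proximal end, an optional decimal suffix for
polymorphic insertions, and a trailing P for pseudogenes) lends itself
to a small parser. The Python sketch below is an illustration only: the
class and function names are hypothetical, and the assumption that a
decimal suffix and a P suffix may appear together is a guess made here
that the text does not confirm.

```python
import re
from typing import NamedTuple, Optional


class VHName(NamedTuple):
    family: int            # VH family number
    position: int          # sequential map position from the JH-proximal end
    insert: Optional[int]  # decimal suffix for a polymorphic insertion, if any
    pseudogene: bool       # True when the name carries a trailing "P"


# Accepts forms such as "6-1", "V 6-1", "7-4.1" and names ending in "P".
_NAME = re.compile(r"^(?:V\s*)?(\d+)-(\d+)(?:\.(\d+))?(P?)$")


def parse_vh_name(name: str) -> VHName:
    """Split a human VH segment name into its components."""
    match = _NAME.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognized VH segment name: {name!r}")
    family, position, insert, p_flag = match.groups()
    return VHName(
        family=int(family),
        position=int(position),
        insert=int(insert) if insert else None,
        pseudogene=(p_flag == "P"),
    )
```

For example, `parse_vh_name("7-4.1")` yields family 7, position 4, and
insert 1, matching the polymorphic segment cited in the text as lying
between 4-4 and 2-5.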

Human JH and DH Regions 

Upstream of the human Cμ gene lies a set of JH-region genes,
including six apparently functional JH regions (386). Interspersed 
among the active human JH genes are three J pseudogenes that 
encode amino acid sequences never found in human H chains and 
that lack the RNA splice signal found at the 3' end of all active JH 
genes. All of the JH genes and pseudogenes demonstrate 23-bp 
RSS spacing (as in the mouse). 

Complete sequence analysis of a 92-kb region spanning the 
human D regions (11) has confirmed the general structure of the 
locus previously deduced from partial sequence analysis and South- 
ern blotting. One germline D gene is located in a position roughly 
homologous to that of the mouse DQ52, that is, 5' to the human JH1.
This human D gene, initially designated DHQ52, bears striking
homology to its murine counterpart but is the only human D seg- 
ment showing such human/mouse homology. All of the other human 
D regions fall into six families and lie in a cluster of duplicated 
domains about 22 kb upstream from JH1. There are 27 D regions;
24 of these are accounted for by four tandem approximate duplica- 
tions of a 9.5-kb segment containing a representative of the six D 
families. In addition to these 24 D regions, three more D regions 
result from (a) an additional partial duplication of 2.8 kb, including 
one D; (b) an internal duplication creating one D; and (c) DHQ52, 
which is in a family of its own, distinct from the six duplicated fam- 
ilies. The D regions have been renamed following a scheme similar 
to that used for the VH genes: a first number identifies the family, 
and a second identifies the sequential position in the locus. The 
locus starts with the 5'-most D region, D1-1, and ends with D7-27
(DHQ52). Three D regions are apparently nonfunctional as a result 
of mutations in RSS 7-mers, and there are two pairs of D regions 
with identical coding sequences (including one of the D segments 
with a 7-mer mutation); so there are 23 distinct D regions that can 
contribute to human Ig diversity. A comprehensive computer analysis
of a data base of published human VH sequences showed that all
of these sequences appear in the data base, many in all three read- 
ing frames. In general, one reading frame encodes primarily 
hydrophilic residues, one encodes hydrophobic residues, and one 
includes frequent stop codons. (Some D regions that contain stop 
codons can be used if these codons are removed by nuclease trim- 
ming before VDJ assembly is complete.) 
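
The bookkeeping behind the counts of 27 D regions and 23 distinct D
regions can be retraced in a few lines of Python. This is a sketch
under one stated interpretation: it assumes that, of the two pairs with
identical coding sequences, one pair includes a nonfunctional member
(so removing that member already leaves its twin unique) and the other
pair consists of two functional segments. The variable names are
illustrative.

```python
# Counts taken from the paragraph above.
d_families = 6                 # D families represented in each duplicated unit
tandem_copies = 4              # approximate tandem duplications of the 9.5-kb unit
extra_d_regions = 3            # partial duplication, internal duplication, DHQ52

duplicated = d_families * tandem_copies          # D regions in the tandem units
total_d_regions = duplicated + extra_d_regions   # all germline D regions

nonfunctional = 3              # D regions with mutated RSS 7-mers
functional = total_d_regions - nonfunctional

identical_pairs = 2            # pairs sharing an identical coding sequence
pairs_with_nonfunctional = 1   # assumption: one pair includes a mutated member

# Only a pair of two functional segments reduces the count of distinct sequences.
distinct_functional = functional - (identical_pairs - pairs_with_nonfunctional)

print(total_d_regions, functional, distinct_functional)
```

Under this reading the final figure agrees with the number of distinct
D regions given in the text.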

In addition to these D regions, all flanked by signal elements
with the typical 12-bp spacing, one putative family, designated DIR
(D with irregular spacing), has been described, having RSS elements
that could be taken with either 12- or 23-bp spacing;
theoretically, DIR regions could contribute extra diversity in the form
of V-D-DIR-J or V-DIR-D-J rearrangements without violating the
12/23 rule (387). However, the systematic evaluation of the data
base of 893 published VH regions failed to detect expression of
DIR regions (11), confirming the failure to detect such rearrangements
using a sensitive PCR assay (388). Also absent from this VH
data base were inverted D regions and the previously hypothesized
D-D rearrangements (which would violate the 12/23 rule),
although evidence that both of these can occur at low frequency has
been obtained using highly sensitive PCR techniques (388,389).
Furthermore, DIR rearrangements, both direct and inverted, were
found in mice transgenic for a human IgH minilocus (389a).
Additional human D segments originally thought to lie upstream from
the main cluster apparently lie on the duplicated orphon cluster on
chromosome 15 and are thus nonfunctional (385,390).
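
The 12/23 rule invoked in this section can be stated as a simple
predicate: two segments can join only if the signal element on the 3'
side of the upstream segment and the one on the 5' side of the
downstream segment have spacers of 12 and 23 bp, in either order. The
Python sketch below encodes that rule. The `Segment` class, the
function name, and the modeling of DIR as accepting either spacer
length on both sides are assumptions made for illustration, following
the description of DIR given above.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Segment:
    name: str
    spacers_5p: FrozenSet[int]  # usable RSS spacer lengths on the 5' side
    spacers_3p: FrozenSet[int]  # usable RSS spacer lengths on the 3' side


def can_join(upstream: Segment, downstream: Segment) -> bool:
    """True if some pairing of the facing signal elements is 12 with 23."""
    return any(
        {a, b} == {12, 23}
        for a in upstream.spacers_3p
        for b in downstream.spacers_5p
    )


# H-chain segments as described in the text: VH and JH carry 23-bp
# spacers, conventional D segments carry 12-bp spacers on both sides.
VH = Segment("VH", frozenset(), frozenset({23}))
D = Segment("D", frozenset({12}), frozenset({12}))
JH = Segment("JH", frozenset({23}), frozenset())
# Assumption: DIR signal elements can be read as either 12 or 23 bp.
DIR = Segment("DIR", frozenset({12, 23}), frozenset({12, 23}))

print(can_join(VH, D), can_join(D, JH))      # conventional V-D and D-J joints
print(can_join(D, D), can_join(VH, JH))      # joints that the rule forbids
print(can_join(D, DIR), can_join(DIR, JH))   # joints needed for V-D-DIR-J
```

The predicate describes only what the rule permits; as the text notes,
rearrangements that violate it have been detected at low frequency.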

Human Germline Vκ Locus

The human Vκ locus is located on the short arm of chromosome
2 (2p11.2). Most of its genes fall into four known V gene families:
VκI, VκII, VκIII, and VκIV. Five cloned Vκ sequences have been
described that would fall into three additional families (VκV
through VκVII), but these sequences have apparently not
contributed to known proteins and thus are probably pseudogenes.
Zachau and colleagues have performed an extensive investigation
of the human locus by cloning, PFGE, and sequence analysis
(391,392). They have identified 76 Vκ sequences in the human Vκ
locus, lying in two 0.4-Mb contigs separated by a spacer of about
0.8 Mb that is apparently devoid of Vκ sequences. The Jκ-distal
(upstream) segment, including 36 Vκ sequences, appears to be the
result of a large duplication. Within each duplicated segment, all V
regions have the same 5'-3' orientation. Remarkably, the segment
distal to the Jκ-Cκ region lies in inverted orientation with respect
to the proximal segment and Jκ-Cκ. Most of the duplicated Vκ
sequences could be assigned to the proximal or distal segment by
preparative separation of the two loci using PFGE, followed by
Southern blots exploiting the rare differences in the restriction
maps of the duplicated segments. Alternatively, assignments could
be deduced from Southern blot bands absent from the DNA of rare
individuals lacking the distal duplication. In all B-lymphoid cell
lines examined, those with rearrangements involving the distal
inverted Vκ segments contained retained signal joints and failed to
show deletions of downstream Vκ and 5' J-flanking DNA,
consistent with Vκ-Jκ recombination by inversion. Except for two
insertion/deletion differences leading to one unpaired proximal Vκ and
one unpaired distal Vκ, the Vκ sequences of the proximal and
distal parts of the locus match their homologs with 95% to 99%
sequence identity. This high degree of similarity suggests a recent
origin for the duplication, which is supported by the fact that the
duplication is not found in chimpanzees or gorillas (393), which
are thought to have diverged from the human lineage only 6 to 8
million years ago. Between the proximal duplication and the Jκ
regions lie an additional six unpaired Vκ sequences, of which the
two nearest Jκ lie in inverted orientation. The most J-proximal Vκ
sequence, the single gene of the VκIV family, is only 23 kb
upstream from Jκ1. Members of the VκI, VκII, and VκIII families
have been found to be extensively interspersed. Of the 76 Vκ
sequences in the locus, 33 are without apparent defect, although
some in the duplicated segments are so similar to their duplicated
counterpart that they do not contribute significantly to diversity of
the locus, and some may not be expressed, for unknown reasons. In
an examination of 70 cDNAs from a human spleen library plus 170
cDNAs from the literature, only 21 of the Vκ genes plus five from
duplicated identical genes were found to be expressed, for an
expressed cDNA repertoire of 27 Vκ genes (394). Of the remaining
Vκ sequences in the germline Vκ locus, 25 are unequivocal
pseudogenes, demonstrating several crippling defects; in addition,
16 sequences have one or two minor defects and might be
functional in some haplotypes.

Apart from the Vκ sequences in the cluster near the Jκ-Cκ locus,
Zachau and colleagues have identified at least 25 orphons. One
orphon cluster is located in the long arm of chromosome 2; perhaps
it was separated from the major locus (on the short arm of this
chromosome) by a pericentric inversion [which must have
occurred rather recently in evolution because it is absent from
chimpanzees and gorillas (395)]. Other orphons are located on
chromosomes 1 and 22; and at least one probably nonfunctional Vκ
lies about 1.5 Mb downstream from Cκ (396).

Human Germline Vλ Locus

For many years the human Vλ system was the least characterized
of the V loci of human and mouse, but the relative obscurity of this
locus has been dramatically reversed by recent intensive cloning,
sequencing, and mapping of Vλ regions (168,397) and ultimately by
the complete sequence analysis of 1,025,415 bp covering the entire
locus (149). The locus contains about 36 potentially functional Vλ
genes (in 10 families), 33 pseudogenes, and 34 relics, containing
Vλ-like sequences severely disrupted by insertions or deletions. (As
noted for other loci, exact numbers may differ depending on the
haplotype and method of analysis.) Of the potentially functional
genes, only about 30 have been documented to be expressed by
comparison with cDNA sequences. Within the clustered Vλ
sequences lies the human VpreB gene, as well as several genes and
pseudogenes unrelated to the λ system. All the Vλ sequences are in
the same transcriptional orientation as the J-C cluster. Analysis of
the 1-Mb sequence shows several segments of internal duplication,
some including Vλ regions. The largest and most frequently
expressed Vλ gene families lie relatively close to the J-C cluster,
mostly within the proximal 400 kb. Interestingly, these families are
most similar in sequence to the Vλ genes of species that express
predominantly this isotype of L chain, including chicken, horse, and
sheep, whereas the Vλ genes of the BALB/c mouse are most
similar to the least frequently expressed human families.

Combinatorial Diversity Estimates 

Before the era of recombinant DNA technology, the source of
antibody diversity was so mysterious that it was whimsically
referred to as the problem of generation of diversity (GOD).
Knowledge of antibody genes gained over the past 20 years has
elucidated the diversity inherent in the germline V repertoire plus the
diversity contributed by recombinational mechanisms (combinatorial
multiplication, flexibility of recombination site, N and P
nucleotides), as already discussed. Together these diversity
elements provide an immense potential repertoire, one so large that to
some investigators it seemed unnecessary to postulate that diversity
was further increased by somatic mutation. As an exercise in
estimating the contribution of germline and recombinational diversity
in the human, consider the number of different antibodies that
could be formed assuming 39 functional VH genes, 27 Vκ genes,
and 30 Vλ genes. For κ sequences, we can multiply 27 (Vκ genes)
x 5 (Jκ regions) x 2 (a conservative multiplier reflecting variability
around residue 96 resulting from flexible recombination), yielding
the product 270. For λ sequences, we can multiply 30 (Vλ genes) ×
4 (Jλ regions) × 2 (flexibility multiplier), yielding the product 240.
Thus, the total VL possibilities are 270 + 240 = 510. For VH
sequences we have 39 functional germline genes × 23 (DH segments)
× 4 (JH) × 4 (flexibility multiplier on both sides of the D
segment) × 3 (possible reading frames of the D region) = 43,056.
Assuming random association of L and H chains to form a complete
L2H2 antibody molecule, the number of different combinations
is 510 × 43,056 ≈ 22 million. This estimate has neglected
additional sources of diversity that are substantial but difficult to
quantitate: the insertion of N and P nucleotides. However, even
neglecting these factors the exercise demonstrates how nature has
greatly enlarged the potential sequence diversity available from a
limited number of total nucleotides by allowing flexible recombination
between different sequence elements.
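
The arithmetic of this exercise can be reproduced in a few lines. The
following is a minimal Python sketch; the function and variable names
are illustrative assumptions, and the segment counts and flexibility
multipliers are simply the rough figures quoted in the exercise, not
measured values.

```python
from math import prod

# Segment counts and multipliers as quoted in the exercise
# (rough estimates; names are illustrative, not standard nomenclature).
KAPPA = {"V": 27, "J": 5, "junction_flex": 2}
LAMBDA = {"V": 30, "J": 4, "junction_flex": 2}
HEAVY = {"V": 39, "D": 23, "J": 4, "junction_flex": 4, "D_reading_frames": 3}


def combinations(factors: dict[str, int]) -> int:
    """Multiply independent choices to get the number of distinct chains."""
    return prod(factors.values())


def antibody_repertoire(kappa, lam, heavy) -> int:
    """L chains are kappa OR lambda (add); each pairs with any H chain (multiply)."""
    light = combinations(kappa) + combinations(lam)
    return light * combinations(heavy)


if __name__ == "__main__":
    print("kappa chains: ", combinations(KAPPA))
    print("lambda chains:", combinations(LAMBDA))
    print("heavy chains: ", combinations(HEAVY))
    print("L x H pairs:  ", antibody_repertoire(KAPPA, LAMBDA, HEAVY))
```

Changing any single count shows how strongly the total depends on
multiplication across segments rather than on the size of any one
gene family.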

Although it is clear that the above mechanisms imply a vast 
repertoire, it is worth considering some qualifications that tend to 
reduce the actual combinatorial diversity, especially early in 
ontogeny. It seems unlikely, for example, that every possible com- 
bination of L and H chains yields a functional antibody molecule 
because in vitro L and H reassociation experiments show that cer- 
tain hybrid molecules (formed from L and H chains derived from 
different antibodies) are relatively unstable. Similarly, association 
of V and J (or V, D, and J) is conceivably not completely random. 
Evidence of striking bias in the selection of VH genes in fetal pre-B
hybridomas has been mentioned. In mice these hybridomas are
biased toward the use of genes from the VH7183 and VQ52 families.
In addition, fetal and newborn VDJ junctions show a paucity
of N nucleotides and a tendency to form VDJ junctions across short
stretches of sequence identity between the recombining sequences
(homology-mediated recombination, discussed earlier in this chapter).
Both of these effects reduce diversity at the recombination
junctions, perhaps reflecting a mechanism that ensures the production
of certain antibodies advantageous for young individuals. The
neonatal bias toward usage of VH7183 and VQ52 families is not
observed in adult B cells, but this bias raises the possibility that
other less striking recombination biases may exist in adults, reducing
the actual diversity below that calculated on simplistic assumptions.
It has been reported, for instance, that mouse Jκ rearrangements
use Jκ1 and Jκ2 preferentially (398,399), whereas human B
cells use JH4 preferentially (400), so that the combinatorial contribution
of the available J regions to diversity is probably less than it
would be if all were used equally frequently.

Somatic Mutation 

Some early arguments suggesting the existence of somatic muta- 
tion in antibody genes were based on claims that estimates of com- 
binatorial diversity (as computed along the lines of the above exer- 
cise) were, although vast, still too small to account for the observed 
number of different antibodies. The latter number might be estimated
from the percentage of B cells binding a particular antigen
and the number of different antibodies (within that binding specificity)
that could be distinguished by isoelectric focusing, idiotype
characteristics, or analysis of the fine specificity of antigen binding.
Such arguments based on global evaluations of diversity were
superseded by studies of systems with restricted diversity, in which
germline and expressed repertoire can be more reliably compared;
these studies generated convincing evidence for somatic mutation.
The brief account below summarizes some of the major features of 
somatic mutation; this topic is discussed in detail in Chapter 24. 



Early Evidence for Somatic Mutation 

Analyses of amino acid sequences of murine λ1 chains from
myeloma antibodies provided the first strong support for somatic
mutation, even before the era of recombinant DNA analysis. Thus,
when the amino acid sequences of λ1 chains produced by 21 independently
derived myelomas were analyzed (401,402), 12 were
found to be identical, representing a prototype Vλ1 sequence. The
remaining variants were each unique, generally differing from the
prototype sequence by single amino acid substitutions that could be
accounted for by single base changes. Significantly, all but one of
the amino acid substitutions were unique to a single variant
sequence. The investigators concluded that the prototype sequence
corresponded to a single germline gene, whereas the variants arose
by somatic mutation of this gene. This interpretation seemed consistent
with the observation that each variant sequence occurred
only once, whereas several occurrences of the same sequence
might have been expected if there were several germline Vλ1
genes. Now that gene cloning has confirmed that there is only a
single Vλ1 gene, the identification of the variants as products of
somatic mutation has been verified.

Subsequent studies led to similar conclusions for mouse Vκ or
VH systems involving small V families whose germline members
could be readily cloned. An example of such a system is the relatively
restricted murine antibody response to phosphorylcholine
(PC). Sequence analysis of a panel of PC-binding hybridomas and
myelomas expressing a similar VH sequence showed that all IgM
antibodies shared a single prototype sequence (403). In contrast,
some IgA and most IgG VH regions showed scattered amino acid
substitutions with respect to the prototype sequence. All of the
sequence variants were unique to single cell lines. By analogy to the
Vλ system discussed above, these comparisons suggested that the
prototype sequences reflected a germline gene, whereas the variants 
were products of diverse somatic mutations. A search of the four 
germline VH-region genes homologous to the prototype expressed 
VH gene showed only one gene that could have served as a precur- 
sor for the PC-binding VH regions; and this one matched the proto- 
type sequence exactly (404). The fact that the variant VH sequences 
were seen only in IgA and IgG, not in IgM, is consistent with the 
fact that IgM is characteristically produced early in the immune 
response, whereas somatic mutation occurs later in the response 
overlapping the stage of isotype switching; other studies have 
shown that somatic mutation can be seen in IgM at a low frequency. 

Role of Hypermutation in Immune Responses

To understand the role of somatic mutation in the antibody 
response, several groups have studied the extent of somatic mutation
at different times after the immunization of mice. Studies of
the responses to p-azophenylarsonate (Ars), phosphorylcholine,
influenza hemagglutinin, oxazolone, and several other antigens
have all indicated that the initial response after primary immuniza- 
tion is contributed by antibodies showing no somatic mutation. 
About 1 week after immunization, mutated sequences begin to be 
observed, increasing during the next week or so. Booster immu- 
nizations yield sequences showing additional mutations. 

Many hybridomas made late in the immune response produce 
mutated antibodies with a higher antigen affinity than the unmu- 
tated (sometimes loosely called germline) antibodies made early 
after immunization. The shift to higher affinity is a phenomenon 
long recognized at the level of (polyclonal) antisera and has been 
termed "affinity maturation." This phenomenon can now be 
explained as the result of an evolutionary mechanism selecting anti- 
bodies of progressively higher affinity from the pool of randomly 
mutated V sequences. According to this model, at the time of initial
antigen exposure an animal has a set of B lymphocytes expressing 
germline (unmutated) versions of Ig sequences resulting from gene 
rearrangements that occurred before immunization. Because of the 
diversity of available VH, D, JH, VL, and JL sequences as well as 
the impressive recombinational potential described earlier, some B 
cells will express Ig molecules capable of binding the antigen with 
modest affinity. These cells are stimulated (by antigen binding) to
proliferate and to secrete antibody. Activated B cells located in lymphoid
follicles also bind antigen and receive T-cell help; at some
point in the response the somatic hypermutation machinery is acti- 
vated in these cells, generating random mutations in the Ig genes of 
stimulated cells in the GCs. Many of these mutations can be 
expected to reduce the resulting antibody's affinity for antigen; 
indeed, such mutated antibodies with markedly reduced affinity 
have been demonstrated (405), as have mutated antibodies that have 
acquired autoantibody specificity (406). As antigen clearance 
reduces antigen concentrations seen by the lymphocytes, only the 
cells displaying high affinity antibody will be effectively stimulated 
by antigen; cells displaying lower affinity antibodies or antibodies 
with affinity for self antigens may be subjected to programmed cell 
death (apoptosis) (407-409). The preferential proliferation of the 
high-affinity cells and their maturation to secreting plasma cells 
will be reflected in an increase in the average affinity of the anti- 
bodies in the serum. These high-affinity cells will be left as the pre- 
dominant population to be represented as memory cells when anti- 
gen exposure ceases; they thus can induce the rapid, high-affinity 
response on secondary antigen exposure. In this model the driving
force for affinity maturation, analogous to natural selection in the
evolution of species, is selection for high antibody affinity in the
face of low antigen concentration. The importance of this selective 
force is suggested by the observation that repeated injection of anti- 
gen can inhibit affinity maturation, as though by abolishing the 
selective pressure for high affinity (410). 

Cellular Context of Somatic Mutation 

Somatic mutations occur primarily in B cells of the GCs of lym- 
phoid tissues (411,412), particularly in a subpopulation of B cells 
known as centroblasts. These cells proliferate in the "dark zone" of 
the GC and bear characteristic surface markers, including IgD, 
CD38, and the receptor for peanut agglutinin (413,414). Each GC 
appears to be populated by a small number of antigen-specific 
founder B cells (412) and an unusual Thy-1-negative T-cell population,
also antigen specific (415). The GC environment promotes
contact between the B cell and follicular dendritic cells (FDCs) 
which store, process, and present antigen, and T-lymphocytes, 
which activate somatic mutation in part via CD40-CD40L interac- 
tion (416). Proliferating GC centroblasts give rise to centrocytes, 
which are programmed for apoptosis unless they are rescued by 
FDC-presented antigen and T-cell activation via CD40 engagement 
(409,417). It is at this stage where positive selection for high-affin- 
ity antibodies occurs via apoptosis of cells expressing low-affinity 
antibodies, yet paradoxically apoptosis is also promoted by soluble 
antigen, perhaps functioning to select against autoantibodies 
(408,418,419). As mentioned earlier, receptor editing may be 
another fate for autoantibody-producing cells in GCs. The features 
of antigen signaling that select for survival versus apoptosis or 
editing are not fully understood. Susceptibility of GC cell populations
to apoptosis is correlated with their expression of Fas, Bax,
p53, and c-myc, all of which promote apoptosis, as well as downregulation
of the apoptosis suppressor Bcl-2. B cells of mice with
engineered up-regulation of Bcl-2 expression can escape selection 
against autoreactivity (419a). 

Germinal center B cells may undergo several successive cycles 
of mutation followed by selection. Such a scheme is suggested by 
the sequence analysis of mutated Ig genes PCR-amplified from 
single cells microdissected from a histologic section of a GC (420); 
resulting sequences can be organized into genealogical trees consistent
with several stages of somatic mutation. Additional evidence
for successive mutations has been reported in purified memory
B cells (421). A computer simulation has affirmed the high
efficiency of alternating periods of somatic mutation and mutation-free
selection as a strategy for generating high-affinity antibodies
(422). Despite the evidence that somatic mutation occurs normally
in GCs, mice lacking histologically detectable GCs as a result of
lymphotoxin-α deficiency are capable of affinity maturation and
somatic hypermutation (422a).
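
The logic of repeated mutation and selection cycles can be illustrated
with a toy simulation. The sketch below is not the published
simulation (422); it is a minimal Python illustration under invented
assumptions: affinity is modeled as a single number, most mutations
are assumed to be harmful, and selection keeps the top fraction of
cells in each round. All names and parameter values are hypothetical.

```python
import random


def mutate(affinity: float, rng: random.Random) -> float:
    """Apply one random mutation; assumed harmful more often than helpful."""
    if rng.random() < 0.2:                     # assumed 20% beneficial
        return affinity * rng.uniform(1.0, 1.5)
    return affinity * rng.uniform(0.3, 1.0)    # otherwise neutral or harmful


def select(population: list[float], keep_fraction: float) -> list[float]:
    """Keep only the highest-affinity cells (stand-in for limiting antigen)."""
    survivors = sorted(population, reverse=True)
    return survivors[: max(1, int(len(survivors) * keep_fraction))]


def germinal_center(cycles: int, seed: int = 0, size: int = 200) -> list[float]:
    """Return mean affinity after each mutation-then-selection cycle."""
    rng = random.Random(seed)
    population = [1.0] * size                  # unmutated founders, affinity 1.0
    means = []
    for _ in range(cycles):
        # Mutation phase: every cell divides into a mutated and an unmutated copy.
        population = population + [mutate(a, rng) for a in population]
        # Mutation-free selection phase, then expansion back to `size` cells.
        survivors = select(population, keep_fraction=0.25)
        population = [rng.choice(survivors) for _ in range(size)]
        means.append(sum(population) / len(population))
    return means


if __name__ == "__main__":
    for cycle, mean in enumerate(germinal_center(cycles=6), start=1):
        print(f"cycle {cycle}: mean affinity {mean:.2f}")
```

Even though most simulated mutations lower affinity, the selection
step discards them, so the surviving population never falls below the
founder affinity in this toy model.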

Distribution and Targeting of Mutations 

To explore the mechanism of somatic hypermutation, several 
groups have examined the distribution of mutations around Ig 
genes by comparing the sequences of somatically mutated 
rearranged genes to their germline precursors. Mutations occur not 
only in sequence derived from the germline V coding sequence, but 
also in the J region and nearby flanking intron sequence derived 
from upstream of the C-region gene. The somatic mutations seem 
to cluster in the V(D)J region, extending upstream no further than 
the RNA initiation site (with few exceptions) and tapering off
downstream to define a target domain of about 1.5 kb. Therefore,
for VDJ units involving the 3' JH4 segment, mutations extend farther
downstream than for units involving JH1 (423,424). The focal
nature of the mutations suggests a specific Ig hypermutation
mechanism that recognizes some feature of the DNA in or near the
VDJ sequence as a target for mutations.

Exactly what feature of the V(D)J locus targets the hypermutation
machinery is not understood. Unrearranged Vκ, VH, and D
regions are generally not mutated, suggesting that the functional
target probably includes elements contributed by both V and J
(425-427); however, unrearranged Vλ regions can be mutated
(428). This difference may be related to the fact that unrearranged
Vλ genes are transcribed in B cells (429), whereas unrearranged
Vκ genes are not (275). Therefore, the element that is contributed
by V(D)J recombination in support of Vκ and VH hypermutation
may be the proximity of the V-region promoters to enhancers lying
near the C region, which can increase transcription. The specific
chromosomal location of Ig genes does not seem to be necessary
for hypermutation because transgenic mice carrying a rearranged
expressible Ig gene (presumably inserted randomly in the
genome) show somatic mutations in copies of the transgene
cloned from hybridomas (430).

The appearance of hypermutation in transgenes has allowed further
experimentation on the sequence requirements for mutation
through studies of the effects of altered transgene structure on the
mutation rate. The importance of transcription in targeting hypermutation
is reinforced by studies of transgenic constructs engineered
with or without either of the two transcriptional enhancers
associated with the κ locus: the "intronic" enhancer lying between
the Jκ segments and Cκ, and the downstream enhancer lying 3'
from Cκ. Rearranged κ transgenes that included the downstream
enhancer and other downstream elements were more highly transcribed
and better somatic mutation targets than similar constructs
lacking these regions (431,432,432a,432b), whereas removal of the
intronic enhancer essentially abolished hypermutation (432). Furthermore,
a VκJκ-Cκ transgene in which a duplicate copy of the
Vκ promoter was engineered upstream from the Cκ region was
found to incur mutations over 1.5-kb domains downstream from
both promoters; the extra promoter created a new mutation domain
extending into the Cκ region (433). However, the promotion of
hypermutation does not seem to be specific to Ig promoters
because replacement of the Vκ promoter with the β-globin promoter
did not abolish hypermutation (432); non-Ig enhancers also
can promote hypermutation (434). Furthermore, the Vκ coding
sequence can be replaced by a human β-globin gene or prokaryotic
neo or gpt gene without affecting the hypermutation rate downstream
from the promoter (435). In contrast, a similar transgenic
construct in which the Vκ gene was replaced by the CD72 gene
was not targeted for hypermutation despite high levels of transcription
(436), and even a highly expressed Vλ-Cλ transgene was
not mutated (437). To summarize, it appears that transcription is 
necessary but not sufficient for targeting hypermutation, and addi- 
tional requirements have not been defined as of this writing. Cur- 
rently available data leave open the possibility that targeting of V 
genes for somatic mutation is not very specific and that some non- 
Ig genes that are transcribed in GC B cells also may be subject to 
mutation (438). Somatic mutations observed in the bcl6 gene may 
represent an example of this phenomenon (438a). 

Because mutations are not confined to hypervariable (CDR) 
regions and sometimes even occur in introns, it is apparent that the 
hypermutation mechanism does not distinguish coding from non- 
coding regions, let alone hypervariable regions from framework. 
The apparent clustering of mutations in the CDRs of sequenced Igs 
may be partly a result of selection for cells expressing primarily 
CDR mutations, either because framework alterations interfere 
with the basic folding of the protein or because CDR mutations can 
lead to higher affinity for antigen and thus stronger activation to 
clonal expansion, as discussed above. However, in Ig genes that are 
not selected for function (e.g., nonproductively rearranged VDJ 
alleles or passenger transgenes engineered with stop codons to pre- 
vent expression as a protein), mutational hot spots as well as cold 
spots have been recognized, apparently due to local DNA features 
that may promote or suppress somatic mutation within the domain 
of DNA targeted for hypermutation. It is possible that evolution has 
selected for sequences that create mutational hot spots in CDR 
regions to enhance the potential for diversity generation in the parts 
of the protein critical for antigen contact (439,440). 

Molecular Mechanism of Hypermutation 

The molecular mechanism of the mutations remains obscure. The 
observed mutations have shown little about what may have caused 
them. All four nucleotides have been targets for mutation, and all 
have been products. Both transitions (purine-purine and pyrimi- 
dine-pyrimidine interchanges) and transversions (purine-pyrimi- 
dine interchanges) have been observed, with apparent preferential 
targeting of G-C base pairs (441). Small insertions and deletions 
rarely occur. Significantly, in an unselected passenger Vκ transgene,
A and G nucleotides were mutated more frequently on the coding
strand than on the noncoding strand (442); this strand polarity,
also observed in human VH regions (443), suggests that the mutation
mechanism may be affected by a process that can distinguish
between the strands, such as transcription through the V region.
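
The distinction between transitions and transversions is easy to state
in code. The following Python sketch classifies single-base
substitutions between an aligned germline and mutated sequence; the
function names and the example sequences are illustrative only and are
not taken from any cited study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(germline: str, mutated: str) -> str:
    """Classify a single-base change as a transition or a transversion."""
    germline, mutated = germline.upper(), mutated.upper()
    bases = PURINES | PYRIMIDINES
    if germline not in bases or mutated not in bases:
        raise ValueError("bases must be one of A, C, G, T")
    if germline == mutated:
        return "no change"
    same_class = ({germline, mutated} <= PURINES
                  or {germline, mutated} <= PYRIMIDINES)
    # Purine<->purine or pyrimidine<->pyrimidine is a transition;
    # any purine<->pyrimidine interchange is a transversion.
    return "transition" if same_class else "transversion"


def count_substitutions(germline_seq: str, mutated_seq: str) -> dict[str, int]:
    """Tally substitution types between two aligned, equal-length sequences."""
    if len(germline_seq) != len(mutated_seq):
        raise ValueError("sequences must be aligned to equal length")
    counts = {"transition": 0, "transversion": 0}
    for g, m in zip(germline_seq, mutated_seq):
        kind = classify_substitution(g, m)
        if kind != "no change":
            counts[kind] += 1
    return counts
```

For example, `count_substitutions("AGCT", "GGAT")` tallies one
transition (A to G) and one transversion (C to A).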

One report has argued that somatic mutations in an expressed 
mouse VH gene occurred by gene conversion, i.e., the clustered 
changes were templated by a nearby related VH region whose 
sequence agrees with all the observed mutations in the expressed 
gene (444). Apparent gene conversion also was observed in a 
mouse strain carrying a transgenic gene construct designed to optimize
the possibility of conversion events (445). Clearly gene conversion
seems to play a major role in somatic diversification of
chicken and rabbit V genes (446,447) and probably pig V genes as
well (448), and it may play a role in the evolutionary diversifica- 
tion of the germline repertoire (449,450); but no further evidence 
supporting a role for gene conversion in somatic diversification of 
murine or human Ig genes has been reported, even in cases where 
such conversion events might be easily detected (451). Indeed, 
gene conversion could not explain many examples of somatic 
mutation (e.g., in Vλ1 genes and in the J regions and associated
introns) because no closely related but different sequences are
present in germline DNA that could donate the mutated nucleotides 
found in these regions of rearranged genes. This argument also 
applies to the prokaryotic transgenes targeted for hypermutation in 
the experiments described above. 

The observation that the sequences near somatically mutated 
nucleotides in Ig genes include direct repeats and palindromic 
sequences has led to the suggestion that these may play a role in 
somatic mutation (452). It also has been proposed that mutations 
may be generated by an error-prone polymerase during repair of 
nicks or gaps in the DNA (453). Because patients or mice with sev- 
eral defects in DNA repair seem competent for Ig somatic hyper- 
mutation, the affected genes are apparently dispensable for this 
process (454,454a). 

A recent model envisions the mutations as a consequence of tran- 
scription-coupled repair (433). In one version of this model, a muta- 
tor protein specific to GC B cells might load onto the transcriptional 
complex at the promoter and cause pausing of the complex at vari- 
ous positions; this pausing would induce gratuitous transcription- 
coupled repair that would occasionally produce errors. Multiple 
rounds of transcription in each cell could offer repeated opportuni- 
ties for mutation by this mechanism. In each round of transcription, 
the mutator protein would fall off the transcription complex as a sto- 
chastic event during progression of the complex downstream, thus 
accounting for the irregular decline in the mutation frequency at 
increasing distances from the promoter. Such a model would be con- 
sistent with the strong correlation between the transcription initiation 
site and the 5' boundary of mutations (454b,454c). 
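
One way to see why stochastic loss of a mutator would produce a
declining mutation frequency is a short calculation. The sketch below
is a toy Python model, not the model of reference 433: it assumes,
purely for illustration, that the mutator dissociates with a constant
probability per base transcribed, so that the chance it is still bound
decays geometrically with distance from the promoter. The parameter
values are invented, and the output is an average over many rounds of
transcription rather than the irregular profile of any single gene.

```python
def retention_probability(distance_bp: int, fall_off_per_bp: float) -> float:
    """Probability the mutator is still bound after `distance_bp` bases."""
    return (1.0 - fall_off_per_bp) ** distance_bp


def expected_mutation_profile(length_bp: int, fall_off_per_bp: float,
                              error_rate_when_bound: float,
                              window_bp: int = 250) -> list[tuple[int, float]]:
    """Expected mutations per transcription round in successive windows."""
    profile = []
    for start in range(0, length_bp, window_bp):
        end = min(start + window_bp, length_bp)
        expected = sum(
            error_rate_when_bound * retention_probability(pos, fall_off_per_bp)
            for pos in range(start, end)
        )
        profile.append((start, expected))
    return profile


if __name__ == "__main__":
    # Hypothetical parameters: mean run length of about 500 bp before fall-off.
    for start, expected in expected_mutation_profile(
            length_bp=2000, fall_off_per_bp=0.002, error_rate_when_bound=1e-4):
        print(f"{start:>4}-{start + 249:<4} bp: {expected:.5f}")
```

Under these assumptions the expected number of mutations falls
steadily in each successive window downstream of the promoter, which
is the qualitative pattern the model is meant to explain.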

The product of the mismatch repair gene Pms2 (homologous to 
mutL in E. coli) has been implicated in somatic mutation by a 
recent experiment: a Pms2 knockout allele was bred into a mouse 
strain (the quasimonoclonal or QM mouse) engineered with
rearranged VκJκ and VDJ genes knocked in to the respective
germline loci by homologous recombination (454d). Although the 
Pms2 mutation causes a general increase in mutation rate in most 
tissues (454e), the immunoglobulin genes in B cells showed sig- 
nificantly fewer somatic mutations than were seen in the QM 
mouse with normal Pms2 (454d), suggesting that Pms2 activity 
contributes to Ig gene hypermutation. Mismatch repair machinery 
could hypothetically participate in Ig gene hypermutation by 
switching its usual preference for correcting the newly synthesized 
strand, instead preserving any mutations in this strand by "correct- 
ing" the opposite strand. 

An important, but as yet unclarified, role for IgD in somatic 
mutation is suggested by the observation that mice with a homozygous
targeted disruption of their Cδ gene were impaired, although
not completely deficient, in affinity maturation (455). Conversely,
an IgM⁻IgD⁺ subset of GC B cells from human tonsils was found
to accumulate extremely high numbers of somatic mutations (456). 

Investigations of somatic mutation should be facilitated by 
recently described systems for observing the process in vitro in primary
B cells (454f,457) or cell lines (441,458) and by the development
of rapid methods for detecting somatic mutation (459).

Immunoglobulin Gene Evolution: Varying Roles
for V Gene Assembly

Evolution of the Immunoglobulin Superfamily 
and V Assembly Recombination 

The three families of Ig genes (κ, λ, and H chains) and the
closely related four families of TCR genes (α, β, γ, and δ) clearly
provide a powerful and flexible molecular defense mechanism that 
is valuable for survival in the face of the diverse pathogenic 
microorganisms that abound in our environment. How did such a 
complex and elegant system evolve? One obvious approach to elu- 
cidating Ig gene evolution is to infer genetic history from compar- 
isons of the Ig gene systems in different modern species. The more 
ancient history of these genes may be approached by examining 
homologous non-Ig genes. The ever-growing number of non-Ig 
genes that demonstrate sequence similarity, and therefore pre- 
sumed homology, to the Ig genes has become known as the Ig 
superfamily (460,461). This family name reflects the fact that the 
Ig genes were the first members to be sequenced and does not 
imply a functional relationship of the superfamily to Igs or to the 
immune system. The hallmark of the Ig superfamily is the Ig 
domain: approximately 100 amino acids, generally encoded in a 
single exon, and including an internal disulfide loop spanning 
roughly 60 to 70 amino acids. Despite some rather tenuous primary 
sequence similarities, the Ig domains are all assumed to share
approximately the same three-dimensional structure found in Igs,
comprising seven roughly parallel strands forming two layers of
pleated sheets. This assumption has been confirmed for several
members of the superfamily, including β2-microglobulin, CD4,
TCR-α and -β chains, and the α3 domain of MHC class I.

Almost all of the Ig superfamily members are surface proteins 
that function by contacting other surface proteins in cell-cell inter- 
actions. Because Ig superfamily members mediating such interac- 
tions are found in even the most primitive metazoan organisms — 
e.g., cell adhesion molecules in slime molds (462)— the Ig domain 
is likely to be truly ancient, significantly predating the function of 
superfamily members in defense against microbial invasion. On the 
other hand, several invertebrate superfamily members have been 
described that do play a role in microbial defense, e.g., the mollus- 
can defense molecule (463) and insect hemolin (464), which may 
even share with Ig genes some features of gene regulation by Rel 
family transcription factors (465). Such examples suggest that 
some members of the superfamily may have functioned in a prim- 
itive immune system in a common ancestor of molluscs, insects, 
and vertebrates. 

Although the Ig domains of most currently known superfamily 
members are encoded in single exons, several examples (e.g., CD4, 
N-CAM, and the Xenopus CTX protein) are encoded in two sepa- 
rate exons (466,467). It is uncertain whether this structure re- 
flects an origin of the Ig domain from association of two primordial 
half-domains by exon shuffling (with later loss of the intron in most 
current superfamily genes) or the introduction of an intron into a 
preexisting single-exon Ig domain. Several of the distantly related 
superfamily members appear to be more C-like or V-like, suggest- 
ing that they arose after the divergence between the primordial C 
domain and V domain. However, this sequence of events is not 
definitively established, and examples of both C- and V-like genes 
are known that show the divided half-domain structure. 

Do the separated V-region elements (V, D, and J) found in modern
Ig (and TCR) genes reflect an association of originally unrelated
sequences or fragmentation of elements that were contiguous
in an ancestral gene? Suggestive observations bearing on this question
have come from an analysis of the CD8 gene. A genealogic
relationship between the CD8 antigen and κ Ig is suggested not
only by sequence similarity but also by the linkage of their genes 
on chromosome 6 of mice and chromosome 2 of humans. The CD8 
gene has been found to include a segment of J-like sequence con- 
tiguous with the V-like sequence (468), suggesting that V and J 
sequences may have been contiguous in a primordial ancestor 
gene. The presently observed separation of V and J may then have 
resulted from insertion of DNA between them by a transposition-like
event (Fig. 17), as originally proposed by Sakano et al. (6). A
second similar insertion may have separated D sequence from the
germline V, as shown in Fig. 17, although an alternative hypothesis
involving a single insertion event also has been proposed
(469). In order for the V, (D), and J regions to be reassembled after
they were rendered noncontiguous by the transposition event, one
would need to assume the prior or simultaneous development of a
mechanism (presumably based on the RAG proteins) for V(D)J
recombination. Because there is no evidence for RAG-like genes in
primitive species without V(D)J recombination, simultaneous
acquisition of the RAG genes and the insertion separating germline
V, (D), and J elements appears reasonable, lending favor to the
speculation that the RAG genes may have been carried on a
transposon-like element (flanked by 7-mer and 9-mer RSS repeats on
both ends) that inserted into a primitive V region. Such a transposon
might have derived by lateral transfer from a prokaryotic element.
Presumably the sequence inserted roughly 400 million years
ago into the genome of a primitive cartilaginous fish because RAG
genes (208), as well as V(D)J recombination of Ig and TCR genes,
are found in modern sharks and all higher vertebrates examined
(470-472); but none of these parameters are found in the slightly
more primitive lamprey and hagfish. Interestingly, the shark RAG-1
gene shows sequence similarity to the INT (integrase) gene of
phage λ (as well as to the yeast DNA repair proteins RAD16 and
18 and the human breast cancer susceptibility gene BRCA1),
whereas RAG-2 shows sequence similarity to the bacterial integration
host factor gene. The similarities to modern prokaryotic genes
with a recombination-related function strengthen the hypothesis
of a prokaryotic source for the RAG genes. Moreover, as noted earlier,
the mechanism of RAG-catalyzed DNA rearrangement, with
a hairpin intermediate, bears some similarity to prokaryotic DNA
recombination mechanisms.

V(D)J recombination may have provided primordial Ig superfamily
genes with their first potential for somatic diversification,
i.e., variable junctions resulting from the flexibility of the recombination
position. Presumably, as soon as diversity became functionally
important for recognition of specific foreign antigens,
mechanisms for clonal activation would have been developed and
allelic exclusion would have become important to focus the specificity
of the response. Conceivably these features arose before the
divergence between Ig and TCR genes.

Diverse Evolutionary Mechanisms for Diversity 

Ig genes of the shark, chicken, and rabbit provide interesting
contrasts to the more familiar evolutionary paths taken by mice and
humans. The shark H-chain locus consists of multiple duplicates of
~10-kb units containing separated V, D, J, and CH elements
(65,473). The V, D, and J elements are associated with recombination
signal elements similar to their mammalian homologs.
Sequence comparisons between duplicated units demonstrate differences
not only between the germline V genes but between the



Immunoglobulins: Molecular Genetics / 147 



CH genes as well. If VDJ recombination occurs only within one of 
these repeat units, as has been presumed, then diversity would 
derive from junctional flexibility but not from combinatorial mul- 
tiplication; sharks also would lack the level of diversity afforded by 
the isotype switch because a particular V region would always be 
associated with a specific CH region. This limited system allows 
the shark to mount specific antibody responses, but these responses 
show remarkably little variation between individuals, although 
somatic mutation does occur. Presumably, the mechanism for 
clonal selection is operative in sharks. 



Diversity generation in chickens follows a still different scheme.
The λ system has been particularly well studied (474,475) and in the
germline consists of a Vλ1 gene 1.7 kb upstream from a typical Jλ-Cλ
unit. All the expressed λ protein apparently derives from VJ
recombination involving this Vλ1 gene. Upstream of Vλ1 lie 25 Vλ
pseudogenes. These pseudogenes cannot themselves encode viable V
regions and do not rearrange with the Jλ segment. However, they
contribute to diversity by donating stretches of their sequence to the
rearranged Vλ1 by a somatic gene conversion process; expressed Vλ
sequences show multiple patches of sequence that differ from Vλ1
but precisely match specific pseudogene sequences. Thus, in contrast
to the somatic hypermutation observed in mice and humans,
individual expressed chicken λ genes show no evidence of random point
mutations, and no sequence alterations are found in the intron
sequences flanking the rearranged VJ unit. Although combinatorial
joining diversity is completely absent in this system, the chicken is
capable of a highly heterogeneous λ response as a result of multiple
rounds of gene conversion events operating in different regions of
the V segment. A similar gene conversion mechanism is important in
generating diversity in the chicken H-chain system.

FIG. 17. Evolution of Ig genes. The Ig superfamily presumably evolved from a primordial cell interaction domain by
multiple rounds of duplication followed by individual mutation and specialization by different duplicated copies.
Because both Ig and TCR systems share specialized V and C domains and V assembly recombination, these features
probably evolved before divergence of these two antigen receptor systems. The evolution of the split V region
(requiring assembly of V, D, and J to form a functional domain) could have resulted from a single event inserting a
D sequence and flanking DNA between V and J of a primordial V domain (pathway on right) or by two separate
insertion events (pathway on left). The primordial H-chain gene evolved by different pathways of duplication in shark and
mammalian lineages. Sharks show duplication of the entire VDJC unit, whereas in mammals separate duplication of
each of these elements occurred.
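
The distinction drawn above between gene conversion and random point mutation can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: the sequences are invented (they are not chicken Vλ data), the function names are hypothetical, and conversion tracts are assumed to be copied from an aligned donor of equal length. It checks whether every difference between an expressed sequence and the rearranged acceptor gene is matched by some donor pseudogene at the same position.

```python
import random

def gene_conversion(acceptor, donors, rounds, tract_len, rng):
    """Overwrite `rounds` short tracts of the acceptor with the aligned
    stretch of a randomly chosen donor (toy model, equal-length sequences)."""
    seq = list(acceptor)
    for _ in range(rounds):
        donor = rng.choice(donors)
        start = rng.randrange(len(seq) - tract_len + 1)
        seq[start:start + tract_len] = donor[start:start + tract_len]
    return "".join(seq)

def explained_by_donors(expressed, acceptor, donors):
    """True if every base that differs from the acceptor matches some donor
    at the same position (the pattern expected for conversion, not point mutation)."""
    return all(
        e == a or any(d[i] == e for d in donors)
        for i, (e, a) in enumerate(zip(expressed, acceptor))
    )

# Hypothetical toy sequences standing in for the acceptor gene and pseudogenes.
ACCEPTOR = "A" * 24
DONORS = [
    "A" * 4 + "C" * 4 + "A" * 16,
    "A" * 12 + "G" * 4 + "A" * 8,
    "A" * 20 + "T" * 4,
]

converted = gene_conversion(ACCEPTOR, DONORS, rounds=5, tract_len=6,
                            rng=random.Random(0))
point_mutant = "T" + ACCEPTOR[1:]  # a change that no donor can account for

print(converted)
print(explained_by_donors(converted, ACCEPTOR, DONORS))
print(explained_by_donors(point_mutant, ACCEPTOR, DONORS))
```

In this model a converted sequence always passes the check, whereas a single untemplated substitution fails it, which mirrors the reasoning used to separate the two mechanisms in expressed sequences.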

Rabbit Igs might have been expected to follow the schema 
demonstrated for the homologous loci from the two most inten- 
sively studied mammals (mice and humans), but the facts are more 
interesting. A particularly puzzling feature of rabbit Ig relates to 
expression of VH allotypes. From 70% to 90% of rabbit antibodies 
display one of three serologically defined allotypes known as a1,
a2, or a3. Rabbits that express one predominant allotype pass this
characteristic to their progeny as though a single gene with three
alleles were being transmitted as a Mendelian codominant trait;
however, Southern blots of rabbit genomic DNA showed several 
hundred VH-hybridizing bands. How could the simple inheritance 
of allotype expression be explained given the large number of VH 
genes? The answer, as demonstrated primarily by Knight and col- 
leagues, is that rearrangements of the most D-proximal VH region, 
designated VH1, account for most of the H chains expressed in rabbits,
and this VH gene encodes the specific amino acids that define
the VHa allotype (476). The other VH-region segments contribute 
to diversity primarily by gene conversion events that alter the VH1 
sequence (447); somatic point mutations apparently occur as well 
(476a). These upstream VH regions may occasionally rearrange 
productively, perhaps accounting for the 10% to 30% of VHa allo- 
type-negative antibodies in normal rabbits. The potential for such 
recombination is suggested by a strain of rabbits (Alicia) in which 
the VH1 gene was deleted; this strain nevertheless makes normal 
amounts of antibody, most of which is VHa allotype negative. Gene 
conversion also contributes to diversity in bovine Igs (477). 

Undoubtedly, examination of the Ig genes of other organisms 
will provide additional details of the evolution of these remarkable 
loci and a better understanding of the differing strategies for the 
generation of diversity. Such studies also should help to elucidate 
the evolutionary relationship between Ig genes, the homologous 
TCR genes, and other Ig superfamily members not involved in 
immune defense. 

REGULATION OF IMMUNOGLOBULIN 
GENE EXPRESSION 

General Principles of Gene Regulation 

The mechanisms that regulate the expression of Ig genes have 
been under intense investigation in recent years as part of a wide 
effort to understand development and differentiation in molecular 
terms. Immunoglobulins are synthesized only by B-lymphocytes; 
even within this lineage these proteins are made in differing 
amounts at different developmental stages. Although rates of pro- 
tein synthesis can be regulated at the levels of mRNA transcription, 
processing, transport, stability, and translation, most attention has 
focused on transcription because this seems to be the limiting step 
in most systems that have been examined [although changes in 
mRNA stability can clearly play an important role (478)]. The gene 
loci encoding lymphocyte antigen receptors (Ig and TCR) are 
unique in that the complete genes do not exist in the earliest stages
of lymphocyte maturation; only the germline precursors are present.
Thus, the regulation of Ig gene expression must be integrated with
the progress of Ig gene rearrangements. These processes are further
intertwined because, as discussed elsewhere in this chapter,
transcription is apparently required for Ig gene rearrangement, both for
V(D)J assembly and isotype switch recombination. Thus, the
transcriptional regulation of nonrearranged loci also merits analysis.

Cis Regulation 

Gene transcription can be regulated by cis influences — depen- 
dent on the DNA sequence of genetic elements attached to a 
gene — and trans influences, dependent on the environment of the 
gene. For most genes, regulatory studies have initially focused on 
the cis elements that regulate gene expression. Some insights have 
been gained by examining how gene expression is affected by 
spontaneous mutations or deletions of regulatory sequences in cells 
or animals. However, most advances have been made by inserting 
putative regulatory elements into DNA constructs containing a 
reporter gene — one whose expression can be conveniently 
assayed — and then transfecting the constructs back into eukaryotic 
cells; the function of the putative regulatory elements is then tested 
by assaying for reporter gene expression. In some experiments the 
assays are performed only 2 to 3 days after transfection, an interval 
so short that most of the DNA remains in an unstable episomal 
form; these are known as transient transfections. In contrast, other
experiments are designed to produce stable transfectants in which
the engineered DNA construct becomes inserted into the cell chro- 
mosomes. As an alternative to transfection of cells, similar con- 
structs can be introduced into the mouse genome, thereby creating 
strains of transgenic mice. The expression of the introduced trans- 
gene can then be assessed in a variety of tissues in the animal to 
examine whether the candidate regulatory element can program the 
same pattern of tissue-specific expression as that observed for the 
gene from which the element was derived. 

Through such transfection and transgene experiments three 
major classes of eukaryotic cis regulatory elements have been 
defined. A promoter is a DNA segment that is located near the tran- 
scriptional initiation site and that promotes the initiation of RNA 
transcription in a specific direction, i.e., toward the coding 
sequence of the gene. An enhancer is a DNA segment that can 
stimulate transcription when positioned at variable distances from 
the transcription initiation site and in either orientation. A silencer 
downregulates transcription, operating (like an enhancer) in both 
orientations and over variable distances via mechanisms not thor- 
oughly understood. All three kinds of elements are generally active 
in only certain cell types and thus participate in regulating the tis- 
sue-specific expression of the associated gene. Two other types of 
cis elements have been characterized in eukaryotic chromosomes 
and should be noted. Matrix attachment regions (MARs) attach 
DNA to the chromosomal scaffold proteins and may promote local 
unpairing of the DNA strands (479,480). Locus control regions
(LCRs), first discovered in the β-globin locus (481), are complex
regulatory regions that are composed of smaller elements that indi- 
vidually have enhancer function. LCRs affect chromatin structure 
and gene activity over longer distances than enhancers are thought 
to act. Operationally they are defined by their ability — when tested 
in transgenic constructs — to program associated reporter genes for 
expression independent of the position of integration into chromo- 
somal DNA; in contrast, constructs without LCRs generally are 
expressed at widely different levels in different transgenic mouse 
strains depending on integration site.

Figure 18 provides an overview of the currently known regulatory
sequences of the Ig loci in the mouse (similar regions have
been reported for most of the homologous human loci). Promoters
are present in the flanking DNA just upstream of each V gene in all
three loci: κ, λ, and H chain. In plasmacytomas only the promoter
of the rearranged V region is active, whereas similar promoters of
unrearranged upstream Vκ or VH regions are inactive. This
observation provoked a search for an additional regulatory sequence
downstream from the J that might activate the promoter of the
adjacent rearranged V region. The J-C regions of the κ and H-chain loci
were screened for regulatory regions, and enhancers were found in
J-C introns of both loci. (The J-C introns of λ loci apparently lack
enhancers.) Near the intronic enhancers of the κ and IgH loci,
silencer regions have been reported that may inhibit the activity of
the associated enhancers in non-B cells. After the discovery of
intron enhancers, two observations led to expectations of additional
enhancers 3' from the C-region genes. First, several myelomas were
found to have undergone spontaneous deletions that eliminated the
J-C intronic enhancer of the expressed H-chain gene, but the myelomas
nonetheless continued to express this gene at normal levels;
these observations were consistent with the presence of an additional
enhancer in the DNA that had not been deleted. Second, enhancers
were discovered downstream from C-region genes in the related
family of TCR genes. Subsequent investigation uncovered enhancers 3'
from κ and λ C-region genes, as well as an enhancer 3' from the Cα
gene, the most downstream constant gene in the H-chain locus.

These enhancers, silencers, and the V-region promoters may be 
sufficient to explain the transcription of complete, assembled Ig 
genes; but additional germline or sterile transcripts are transcribed 
from Ig C-region genes that are being activated for V assembly or 
isotype switch rearrangements. These transcripts are also controlled 
by promoters (Fig. 18), which in some cases have been found to be
critical for regulation of the associated DNA rearrangement. 

Promoters, enhancers, and silencers are composed of clusters of 
several short sequence motifs, each of which can be recognized by 
a specific nuclear protein (or proteins). Some of these motifs are 
present in more than one enhancer or may even be shared between 
enhancers and promoters. In the discussion below, several of the 
important murine regulatory regions and their functional motifs are 
described, along with nuclear protein families known to regulate Ig 
gene expression by binding to these motifs. Each murine regulatory
region has been found to have an apparent homolog in
humans, often with many of the same nuclear binding motifs
conserved. The presence of multiple motifs in a given enhancer
complicates the analysis of the role of any one motif. Engineered
mutations in a particular motif often have very little effect on the
activity of the complete enhancer, and an artificial construct
containing a single functional motif often shows no enhancer
activity on its own. Two strategies have been used to demonstrate
the function of such motifs. In a construct with an enhancer frag- 
ment in which all but a few motifs have been deleted, the contribu- 
tion of each remaining element is often detectable through the 
effects of mutations. Alternatively, an artificial enhancer contain- 
ing several multimerized copies of a single motif may have 
enhancer activity when a single copy does not. The proteins that 
bind to enhancer motifs mediate the regulatory function by
promoting (or inhibiting, in the case of silencers) the assembly of
transcriptional machinery at the promoter. The proposed interactions
between proteins binding enhancer and promoter imply that the
intervening DNA forms a large loop. Many regulatory proteins are
present in the nuclei of only certain tissues or cell types, a fact that
can in principle explain the cell type specificity of the transcription
of particular genes. External stimuli that up- or downregulate Ig
gene expression (e.g., cytokines or antigen binding) typically work
by altering the amounts or activity of certain DNA-binding proteins.

Types of Trans Effects 

Alteration in the nuclear content of DNA binding proteins that 
interact with cis regulatory elements represents a well-studied 
mechanism for trans regulation of gene expression, but other 
approaches to investigating trans regulation should also be men- 
tioned. One correlate of gene activation that formally falls in the 
class of trans effects is the altered chromatin environment of DNA 
in expressed genes that is often detectable by nuclease sensitivity 
experiments. In these experiments isolated nuclei are treated with 
varying concentrations of DNase I (or a variety of other nucleases, 
including restriction endonucleases) and the DNA is then purified, 
digested with a restriction enzyme, and analyzed by Southern blot- 
ting using a hybridization probe for the genes under study. In gen- 
eral, when the nuclei are derived from cells expressing a particular 
gene, that gene is more sensitive to DNase I than unexpressed 
genes in the same cells; i.e., a Southern blot band carrying the 
expressed gene can be abolished by treatment with low concentra- 
tions of DNase that leave unexpressed genes (or their Southern blot 
bands) relatively unaffected. In addition, appropriate Southern blot 
strategies show that certain segments of DNA in expressed genes 
may be hypersensitive to DNase I; these segments tend to coincide 
with regulatory regions of genes where sequence-specific binding 
by regulatory proteins blocks access of these DNA regions to 
nucleosomes, rendering them accessible to nucleases. 

Another chromatin correlate of gene activation is the extent of 
DNA methylation. Most cytosine residues within CpG dinu- 
cleotides are methylated in mammalian DNA, but genes that are 
actively expressed in a particular cell generally appear relatively 
undermethylated in that cell type (482). The extent of CpG methy- 
lation can conveniently be estimated for that subset of CpG dinu- 
cleotides that fall within the sequence CCGG, which is the recog- 
nition site for two restriction endonucleases: Msp I cuts at this site 
regardless of the methylation status of the internal cytosines, 
whereas Hpa II cuts only the completely demethylated site. South- 
ern blot strategies that exploit this difference have been used to 
compare methylation in active and inactive genes. Both μ and κ
C-region genes have been shown to be sensitive to DNase I and
undermethylated in pre-B cells, B cells, and plasma cells but DNase
resistant in nonlymphoid cells (483-487). Cells that are undergoing
isotype switching show undermethylation of the C-region genes 
that are the targets of switch recombination (483,488); such under- 
methylation correlates with synthesis of germline transcripts from 
these CH genes. Of V-region genes in B cells, only the rearranged 
and transcribed V gene generally shows the undermethylation and 
DNase sensitivity characteristic of active genes (484,489). In the 
IgH locus, DNase I hypersensitivity sites have been found overlap- 
ping the intronic (486) and downstream enhancers (490,491). In the 
κ gene, hypersensitivity sites occur at the promoter and enhancer as
well as at a site 5' from the enhancer (492,493). 
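
The logic of the Msp I / Hpa II comparison can be sketched in a few lines of code. This is a simplified illustration, not a laboratory protocol: it assumes a linear DNA molecule, complete digestion, and a single methylation flag per CCGG site (the internal cytosine); the function name and the toy sequence are hypothetical. Both enzymes are modeled as cutting after the first C of CCGG, with Hpa II blocked at methylated sites.

```python
SITE = "CCGG"  # recognition site shared by Msp I and Hpa II

def digest(seq, methylated_internal_c, enzyme):
    """Return fragment lengths after complete digestion of a linear molecule.

    methylated_internal_c: set of 0-based indices of methylated internal
    cytosines (the second C of a CCGG site).
    enzyme: "MspI" (cuts regardless of internal-C methylation) or
            "HpaII" (cuts only unmethylated sites).
    """
    if enzyme not in ("MspI", "HpaII"):
        raise ValueError("enzyme must be 'MspI' or 'HpaII'")
    cuts = []
    pos = seq.find(SITE)
    while pos != -1:
        internal_c = pos + 1
        blocked = enzyme == "HpaII" and internal_c in methylated_internal_c
        if not blocked:
            cuts.append(pos + 1)  # cut between the two cytosines: C^CGG
        pos = seq.find(SITE, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

# Toy 20-bp molecule with CCGG sites starting at positions 4 and 14.
toy = "AAAA" + "CCGG" + "TTTTTT" + "CCGG" + "AA"
methylated = {15}  # internal C of the second site is methylated

print(digest(toy, methylated, "MspI"))   # both sites cut
print(digest(toy, methylated, "HpaII"))  # second site protected
```

A fragment that appears with Msp I but is replaced by a larger one with Hpa II therefore indicates methylation at the intervening site, which is the pattern read from the Southern blots described above.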

Methods of Studying Trans-Acting Proteins 
Binding to cis Regulatory Motifs 

FIG. 18. Enhancers and promoters of the murine Ig loci. Schematic maps (not to scale) of the three murine Ig loci
[...] silencer regions by black circles, and the various promoters by arrows indicating the direction of transcription. The
enhancer reported upstream from murine DQ52 (572) is not shown in the graphic image.

Recent studies have investigated trans-acting proteins identified
by their interaction with known cis-acting promoters or enhancers. In
vitro binding of nuclear proteins to specific regulatory sequence
elements can be detected by several techniques, some of which can be
used to assess sequence-specific binding even in crude protein
mixtures. The simplest technique is the EMSA. In this assay a short
(typically 30-300 bp) radioactively labeled double-stranded DNA
fragment is allowed to interact with a mixture of proteins extracted from
cell nuclei by a salt solution; when the DNA is then electrophoresed
in an acrylamide gel, binding of protein(s) to the DNA can be
detected by the retarded mobility of the protein-DNA complexes in
the gel in comparison with the mobility of the free DNA probe.
Sequence specificity of the retarded band must be demonstrated by 
showing (a) that its intensity can be diminished by adding to the 
incubation mixture an unlabeled competitor oligonucleotide identi- 
cal to the probe sequence, but (b) that a similar amount of oligonu- 
cleotide of unrelated sequence is without effect. Retarded complexes 
can be identified as containing an already characterized protein if an 
antibody to that protein specifically eliminates the band or supershifts it
(i.e., further retards its electrophoretic mobility). Another
powerful technique, DNA footprinting, allows the visualization of 
the specific DNA sequence covered by a bound protein. In this tech- 
nique a protein preparation is allowed to bind to a fragment of DNA 
that has been radioactively labeled on one end of one strand. The 
DNA-protein mixture is then treated with DNase under conditions so 
mild that on average each strand will be nicked by the enzyme only 
once; then the DNA is purified from the incubated proteins and elec- 
trophoresed on a denaturing acrylamide gel (along with size mark- 
ers) in order to detect the positions of DNase-induced nicks in the 
radiolabeled strand. A nuclear protein that can bind tightly to the 
radioactive DNA fragment during the initial DNase incubation step
protects the region of the DNA that it contacts from nuclease attack,
and so the position of the bound protein can be inferred from a 
region of the fragment that is relatively free of nicks (the footprint). 
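
The inference of a footprint from the nicking pattern can be illustrated with a small sketch. It assumes, purely for illustration, that band intensities have already been quantified per nucleotide position for a protein-free control lane and a protein-bound lane; the function name, the threshold ratio, and the minimum length are arbitrary assumptions rather than established values.

```python
def footprint(control, bound, ratio=0.25, min_len=4):
    """Return half-open (start, end) intervals where cleavage in the
    protein-bound lane is below `ratio` times the control lane for at
    least `min_len` consecutive positions."""
    if len(control) != len(bound):
        raise ValueError("lanes must cover the same positions")
    regions = []
    start = None
    for i, (c, b) in enumerate(zip(control, bound)):
        protected = c > 0 and b < ratio * c
        if protected and start is None:
            start = i                       # a protected run begins
        elif not protected and start is not None:
            if i - start >= min_len:        # keep only runs that are long enough
                regions.append((start, i))
            start = None
    if start is not None and len(control) - start >= min_len:
        regions.append((start, len(control)))
    return regions

# Hypothetical nick counts per position (arbitrary units).
control_lane = [10] * 20
bound_lane = [10] * 6 + [1] * 8 + [10] * 6   # positions 6-13 are protected
noisy_lane = [10] * 5 + [1] * 2 + [10] * 13  # dip too short to call

print(footprint(control_lane, bound_lane))
print(footprint(control_lane, noisy_lane))
```

Requiring a run of consecutive protected positions reflects the fact that a bound protein covers a contiguous stretch of DNA, whereas an isolated weak band may simply reflect the sequence preference of the nuclease.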

Once a protein has been detected that binds to a critical regula- 
tory element in the DNA, detailed study of the protein requires 
molecular cloning of its gene. Two main strategies have been used 
for such cloning. In one approach the protein is first purified by 
classic fractionation procedures. EMSA or DNase footprinting 
assays can be used to follow the binding protein through fractiona- 
tion steps. Typically, the purification includes an affinity column in 
which the DNA sequence representing the binding target is fixed to 
the column bed; the specifically interacting protein binds this DNA 
sequence with high affinity, separating it from contaminating 
material. When the protein is pure, amino acid sequences are 
obtained from tryptic fragments; these sequences are used to 
design DNA probes that can be used to isolate clones from a cDNA 
library. An alternative cloning strategy (494) bypasses the protein 
purification procedure. From a cell expressing the binding protein, 
a cDNA library is constructed using the vector λgt11, a bacteriophage
engineered to allow transcription and translation of insert
cDNA sequences in infected bacteria. A library of viral plaques 
imprinted onto membrane filters is screened by soaking the filters 
in a solution containing the target DNA binding sequence as a 
radioactively labeled fragment. A plaque that expresses a cDNA 
encoding the binding protein is able to bind the probe and thus cre- 
ates a radioactive spot on an autoradiograph of the filters. Clones 
identified by their position on the filters are then isolated for study.

The fact that a purified nuclear protein binds in a sequence-specific
manner to a regulatory DNA sequence does not prove that this
protein mediates the regulatory function of the DNA sequence.
However, a functional role for an enhancer-binding protein can be
inferred if transfection of a clone encoding the protein induces
transcription of a cotransfected reporter gene linked to the
enhancer/promoter motif that is bound by the cloned protein. Such
experiments have verified the function of several Ig promoter- and
enhancer-binding proteins, which can therefore be considered
transcription factors. Some of these occur only in B-lymphocytes and
thus can account in part for the B-cell specificity of Ig gene
expression. Others are more widespread. Many of these regulators of Ig
genes are homologous to mammalian oncogenes as well as to genes
of Drosophila and yeast, suggesting ancient evolutionary origins
and fundamental importance of these proteins in the regulation of
cellular metabolism.

Several of these proteins will be discussed below in connection 
with the cis regulatory elements with which they interact. Most of 
this discussion is based on analysis of murine Ig genes, which have 
been examined most extensively. 

Cis-Acting Elements in V-Region Promoters
The Octamer Motif 

The transcription of assembled Ig genes initiates upstream from 
the V gene sequences. The promoters that regulate this initiation are, 
by virtue of their upstream positions, present in each germline V- 
region gene even before V assembly recombination. Like many 
eukaryotic genes, most V gene promoters contain a TATA site about 
25 bp 5' from the initiation site; TATA sites serve as binding sites for 
the transcription factor TFIID and related proteins and thereby play 
a role in specifying the exact position where RNA transcription
initiates. The only other conserved feature of all classes of Ig V
promoters (i.e., κ, λ, and H chain) is an octamer ATTTGCAT that is
found associated with Vκ and Vλ genes, whereas the inverted
complement ATGCAAAT is found 5' from VH genes (495,496). (The
same sequence is sometimes identified as a decamer TNATTTGCAT
or the complementary ATGCAAATNA.) Vκ promoters generally
include only the octamer plus the TATA box, whereas VH and Vλ
promoters can include both of these motifs as well as other
characteristic regulatory elements. The conservation of the octamer
element in Ig V promoters suggested that it might play an important role
in Ig gene function; indeed, when constructs containing this motif
were analyzed by transfection, it became clear that the octamer is
critical in conferring B-cell specificity to the promoter. Deletions or
mutations in the octamer cause dramatically reduced B cell-specific
promoter activity when tested either in Ig gene constructs or in
heterologous genes transfected into B cells (497-501). The octamer also
has been shown to be required for optimal in vitro transcription by
B-cell nuclear extracts, whereas it had no effect on transcription by
HeLa extracts (502). Octamers appear in the promoters of several B
cell-specific genes other than Ig, including B29 (Igβ) (503), CD21
(504), and CD20 (505). A multimerized octamer can act as a B
cell-specific enhancer (506).
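
The relationship between the two octamer orientations is easy to verify computationally. The sketch below is an illustration only: the helper names and the test sequence are hypothetical, and the scan looks for exact matches on the strand supplied. It confirms that ATGCAAAT is the reverse complement of ATTTGCAT and reports where either orientation occurs.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
OCTAMER = "ATTTGCAT"  # orientation associated with L-chain V promoters

def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def find_octamers(seq):
    """Return (position, motif) pairs for exact octamer matches in either
    orientation on the given strand (0-based positions)."""
    inverted = reverse_complement(OCTAMER)
    hits = []
    for i in range(len(seq) - len(OCTAMER) + 1):
        window = seq[i:i + len(OCTAMER)]
        if window in (OCTAMER, inverted):
            hits.append((i, window))
    return hits

# Hypothetical sequence carrying one copy of each orientation.
example = "GG" + "ATTTGCAT" + "CCCC" + "ATGCAAAT" + "GG"

print(reverse_complement(OCTAMER))
print(find_octamers(example))
```

Because the two orientations are the same double-stranded site read from opposite strands, a protein that recognizes one necessarily recognizes the other; only the direction relative to the transcription start differs.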

A puzzling feature of transcriptional regulation by the octamer is 
that this sequence is also a functional component of promoters of 
several other genes whose transcription is not B cell specific. These 
include the herpes thymidine kinase gene, histone H2B genes, and 
U1 and U2 small nuclear RNA genes. To understand the puzzling
relationship between octamers in Ig and non-Ig promoters, several
laboratories have undertaken analyses of the trans-acting nuclear
proteins that bind to these elements. Two such proteins, designated
Oct-1 and Oct-2, have been extensively characterized, initially by
EMSA experiments and subsequently by gene cloning (507-510).
These two proteins show differing tissue distributions. Most cells
make Oct-1 (also known as OTF-1 and NFA-1), but only B cells and
a few other cell types (notably, activated T cells) make Oct-2
(OTF-2, NFA-2). Several additional octamer-binding proteins specific to
other tissues (e.g., in neural cells or embryonic stem cells) have been
reported. The Oct proteins share a similar 160-amino acid DNA
binding domain, which explains their virtually identical binding
specificity. Amino acid sequences similar to this DNA binding
domain have been found in several other nuclear proteins that bind
to motifs resembling the octamer. This binding domain thus defines
a family of nuclear factors, which has been designated the POU
family (pronounced "pow"), named for the three factors in which
this conserved domain was first noted: Pit-1, Oct-1/2, and the
nematode gene unc-86. The domain includes a 75- to 80-residue
POU-specific domain (POUS), a short flexible linker, and a 60-amino acid
segment (POUH) homologous to the homeobox domain.
(Homeoboxes were first recognized in genes regulating Drosophila
development, but more recently have been noted in genes throughout the
animal kingdom and even in plants.) The POUS domain contacts the
ATGC part of the ATGCAAAT sequence, whereas the POUH
domain contacts the AAAT segment (511).

The Oct proteins have been demonstrated to be transcription fac- 
tors by experiments in which the corresponding genes were 
cotransfected into fibroblasts or HeLa cells along with reporter 
gene constructs driven by octamer-containing promoters. Critical 
activation regions, necessary for the Oct proteins to stimulate tran- 
scription, have been deduced from the effects of deletions and 
mutations placed in different regions of Oct proteins (512) and the 
effects of swapping (through genetic engineering) various domains 
between Oct-1, Oct-2, and other POU proteins (513-515). On the 
N-terminal side of the POU domain, Oct-1 and Oct-2 both contain
a glutamine-rich activation region, but on the C-terminal side Oct- 
2 contains a feature missing from Oct-1: an activation region rich 
in serines, threonines, and prolines. Apparently the C-terminal dif- 
ferences are functionally important because swapping the N-termi- 
nal domains between Oct-1 and Oct-2 has little effect, whereas 
replacing the C-terminal domain of Oct-1 by that of Oct-2 confers 
a distinctive property of Oct-2: the ability to activate transcription 
from multiple octamer motifs functioning as either a promoter or 
an enhancer (506). 

The B-cell specificity of Oct-2 suggested that this factor might 
be important for activity of the octamer motif in V promoters, and 
some evidence supports this inference. However, targeted disrup- 
tion of the Oct-2 genes in a B-cell line (516) produced little effect 
on the expression of either endogenous Ig genes or a transfected 
gene driven by an octamer-containing promoter. Furthermore, 
although homozygous Oct-2 knockout mice (517) die without 
obvious pathology within a few hours of birth, they contain roughly 
normal numbers of B cells, which respond to activation by a T-cell 
clone with near normal cell proliferation and Ig secretion (518). 
These results suggest that Oct-2 is unnecessary for the V-region 
promoter activity required either for early B-cell development or 
for T cell-activated Ig secretion, possibly because of the redundant
role of Oct-1 for these processes. On the other hand, cultured B 
cells from the homozygous Oct-2 knockout animals showed 
marked defects in LPS-plus-cytokine-activated Ig secretion and in 
anti-IgM-induced proliferation, suggesting a role for Oct-2- 
dependent proteins in these signaling pathways. 

The B-cell specificity of Oct proteins is complicated by their 
interactions with additional proteins. One such protein, designated
octamer coactivator from B cells (OCA-B), was originally
detected in affinity-purified preparations of either Oct-1 or Oct-2 as
a factor necessary for optimal in vitro transcription from a Vκ
promoter (519). The purified protein has now been cloned by several
laboratories, which use several different names for it: OCA-B (520),
OBF-1 (521), and Bob-1 (522). Binding of OCA-B to the
octamer/Oct-protein complex apparently stimulates transcription 
through a transcriptional activation domain of OCA-B (523). The 
OCA-B protein binds to the POU domain of either Oct-1 or Oct-2, 
but also may contact the DNA at the fifth base of the ATGCAAAT 
sequence; oligonucleotides with alterations at that position bind the 
Oct proteins normally but cannot form a complex with OCA-B, nor 
can a reporter construct mutated at that position show OCA- 
B-induced stimulation of transcription (523,524). An important 
role of this protein for in vivo regulation of Ig production is sug- 
gested by the effects of OCA-B disruption by gene targeting 
(525,526). OCA-B knockout mice seem healthy and are fertile but 
show defects in B-cell maturation and Ig production. The number of
mature B cells in the spleen is reduced, and the response to immu- 
nization is dramatically impaired, with reduced proliferation and a 
severe decrease in IgG, IgA, and IgE, apparently due to decreased 
Ig gene transcription in B cells that have undergone isotype switch 
recombination. GCs are not formed in these mice. Some of these 
effects are apparently mediated by decreased Ig gene transcription, 
whereas others may result from interference with OCA-B-depen- 
dent expression of other genes. Purified OCA-B seems to bind pref- 
erentially to Oct-1 rather than Oct-2 (527); a second coactivator has 
been postulated to mediate Oct-2-dependent transactivation (528).
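
The position-5 experiment described above can be illustrated with a short sketch. The Python below is only an illustration: the function names and the toy sequence are hypothetical, not taken from the cited studies, and only the forward strand is searched. It locates the octamer ATGCAAAT and substitutes its fifth base, the kind of alteration reported to leave Oct binding intact while preventing complex formation with OCA-B.

```python
OCTAMER = "ATGCAAAT"  # octamer sequence as given in the text


def find_octamers(seq):
    """Return 0-based start positions of every exact octamer match (forward strand only)."""
    seq = seq.upper()
    width = len(OCTAMER)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == OCTAMER]


def mutate_fifth_base(seq, start, new_base):
    """Replace the fifth base of the octamer that begins at `start`."""
    pos = start + 4  # fifth base, counting the first base as position 1
    return seq[:pos] + new_base + seq[pos + 1:]


# Hypothetical toy sequence containing one octamer.
toy = "GGCCATGCAAATCCGG"
hits = find_octamers(toy)
mutant = mutate_fifth_base(toy, hits[0], "C")
```

After the substitution the sequence no longer contains an exact octamer, so `find_octamers(mutant)` returns an empty list. The choice of C as the replacement base is arbitrary.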

What is the critical feature of octamer motifs in Ig promoters 
that confers B-cell specificity when the same motif in an H2B pro- 
moter is active ubiquitously? Although the answer is still not 
known, one possibility is that apart from the TATA box, octamers 
in Ig promoters are not associated with other important promoter 
motifs that might allow ubiquitous expression in ubiquitously 
expressed genes. Consistent with this view is the observation that 
the insertion of a CCAAT promoter motif into an otherwise lymphoid-specific
promoter renders the promoter active in nonlymphoid
cells (529). It is also possible that, as in the case of several
other coactivators for Oct-1, including VP16 (530,531) and PTF
(532), sequences outside the octamer play a role in discriminating
between Ig and other promoters. OCA-B may mediate some of this 
discrimination because OCA-B, when added to a HeLa-derived in 
vitro transcription system or when coexpressed in HeLa cells, 
could coactivate a construct driven by a Vκ promoter much more
effectively than a similar construct with an H2B promoter, even 
though both promoter sequences supported complex formation 
with OCA-B (520,521). Candidate motifs that might contribute to 
B-cell specificity of Vκ promoters include sequences downstream
from the transcription start site (532a). 

In addition to its role in V promoters, the octamer also appears 
in the H-chain enhancer, where it can contribute to the B cell speci- 
ficity of constructs transfected into various cell lines (533), 
although it did not appear critical for enhancer function in trans- 
genic mice (534). This octamer may activate the enhancer under 
certain conditions of cell stimulation (535) and clearly plays an 
important role when this region functions as a promoter driving 
sterile transcripts of the Cμ gene (536). Additional DNA segments
that are similar, but not identical, to the octamer have been found
in several other regulatory regions of Ig genes, e.g., upstream of the
mouse κ intron enhancer (537). However, the functional importance
of most of these octamer-like motifs has not been demonstrated.
The mechanism of regulation by the Oct proteins is likely
to be considerably more complex than outlined here because of the
existence of several isoforms resulting from alternative RNA splicing
(538-540), several phosphorylation states critical to protein
function (513,541), and the ability of both OCA-B and the Oct proteins
to interact with other regulatory factors (520,542,543).

Other Elements of Ig V Promoters 

Although unusual Vκ promoters that lack efficient octamer
motifs have been reported to attain promoter activity through an
alternative pyrimidine-rich element designated κY (544), and a
motif binding early B-cell factor may contribute to some Vκ promoters
(545), Vκ promoters are typically composed only of the
octamer plus TATA box. In contrast, Vλ and VH promoters routinely
contain additional functional elements besides the octamer
and TATA box, some of which are briefly described below. In VH
promoters a heptamer CTCATGA generally lying 2 to 22 bp 5'
from the octamer was found to be well conserved and required for
optimum promoter activity (546). Surprisingly, although this
sequence bears little resemblance to the conserved octamer ATGCAAAT,
it appears to bind in vitro to both Oct-1 and Oct-2
(547-551). The heptamer binds these proteins with lower intrinsic
affinity but shows cooperative interaction with occupancy of an
adjacent octamer site. Cooperativity also can be demonstrated at a
functional level by in vitro transcription experiments (551).
Another element showing some sequence conservation in VH promoters
and a role in optimal promoter function is a polypyrimidine
tract located between 0 and 46 bp upstream from the heptamer
(546). A motif that includes a polypyrimidine tract (GGAACCTCCCCC)
has been identified as a component required for optimal
function of the MOPC141 VH promoter (552). This motif, which
was designated the N element, was found to bind a novel transcription
factor of ubiquitous distribution. The relationship
between the N element and the pyrimidine-rich κY motif is not
known at present. A motif (TTANGTAA) conserved in many VH
regions binds to C/EBP factors, originally identified as binding to
the E motif in the μ enhancer. This motif is required for optimal
transcription of VH promoter-driven transfected constructs in
vivo, and the purified binding protein stimulates transcription from
such promoters in vitro (553). One final VH element deserves
mention. In an investigation of the mechanism by which treatment
with the lymphokine IL-5 plus antigen upregulates Ig H-chain
mRNA, Webb et al. (554) detected an A/T-rich element between
125 and 250 bp 5' from the VHS107 start site that could mediate
increased transcription by these agents. In an EMSA experiment
this element produced a band that was upregulated in extracts
obtained from cells treated with IL-5 plus antigen. In its A/T richness,
the element resembles MARs, and a cloned protein corresponding
to this binding activity, designated B-cell regulator of
IgH transcription (Bright) (555), partitions partly with the insoluble
chromatin matrix. The significance of the Bright binding site
for VH function is uncertain because most VH regions lack similar
sequences within the 5' flanking region as far as has been
sequenced; furthermore, a transgenic construct driven by a related
VH promoter deleted for Bright binding was still expressed in a
lymphocyte-specific manner (556). Additional response elements
may be discovered as the mechanisms of Ig transcriptional
response to other manipulations (including other lymphokines) are
investigated. To speculate further, the several VH promoter elements
that are absent from VL promoters may facilitate the early
transcription of germline VH genes, allowing VDJ recombination
to occur at a time when VL genes are transcriptionally inactive;
however much additional evidence would be necessary to support
such a hypothesis. It is also possible that variations in the content
or spacing of different elements in different VH promoters may
differentially regulate V gene transcription, thereby [...]
frequency with which specific V regions are rearranged and [...]

Vλ promoters [...] promoter was [...]
[...] plus an additional functional
element located upstream from the octamer and not precisely conserved
[...]. CACGTGAC, is identical to the [...]
[...] by the protein USF (upstream stimulatory factor) (560).
USF is a ubiquitous transcription factor originally isolated based
on its role in regulating the major late promoter of adenovirus, but
[...] found to regulate a wide variety of cellular genes. [...]
transcription was found to be reduced by passage of the
[...] USF antibody column but could be restored
by addition of purified USF [...]
murine Vλ2 promoter includes a functional USF motif. Little is
currently known about the functional components of human Vλ
[...] of other murine Vλ [...]
[...] contain octamer-like sequences and TATA boxes.



Promoters of Sterile Transcripts 



[...] (274,562,563), Cκ (305,564,565), or Cμ (566,567) [...]
initially detected, and more recently sterile transcripts of most of
the downstream CH genes of the murine [...]
characterized. Salient features of the regulation of these transcripts
are briefly outlined below.

Sterile V-Region Transcripts 

[...] whether sterile or mature (i.e., V(D)JC) transcripts [...].
Therefore, the earlier discussion of V-region promoters probably
applies to sterile transcripts. It remains to be [...]
mechanisms allow the promiscuous [...]
transcripts during the developmental stage when V(D)J rearrangements
are occurring, but later shut off the promoters of all but the
rearranged V regions (274,275).

Sterile Cμ Transcripts

Two types of sterile μ transcripts have been described. In the first
type, transcription can initiate at heterogeneous positions near the 5'
end of the JH-Cμ intronic enhancer (Eμ). When the resulting RNA
transcripts are spliced to the Cμ1 exon, an Iμ exon (intron-derived)
remains attached to the RNA encoding Cμ. The promoter of Iμ-Cμ
transcripts has been found to be coincident with the Eμ enhancer;
however, as mentioned above, the octamer motif plays a much more
prominent role for this region as a promoter than it does as an
enhancer (536). This promoter lacks a TATA box in both murine
(536) and human (568) loci. Because the TATA box is generally
[...] 5' ends of these transcripts. The Iμ exon [...]

[...] the structure DJ-Cμ after splicing out the JH-Cμ intron.
[...] transcripts initiate from promoters that lie upstream of the germline
DH elements (570) but have not been fully characterized. For
[...] the most JH-proximal murine D region [...] (572). In mice, the
[...] expressed V regions,
as described earlier.

Sterile Cκ Transcripts

Two types of sterile Cκ transcripts also have been described: an
8.4-kb primary transcript (564,573) that initiates about 3.5 kb
upstream from Jκ1 (and is processed to a [...]-kb RNA) and a 4.7-kb
primary transcript that initiates just upstream from Jκ1 (and is
processed to [...] kb) (573). Both of these transcripts are found in
pre-B cells and are upregulated by exposure to LPS. The 5' flanking
sequences of initiation sites contain octamer-like [...]
remains to be explored.



Sterile I-CH Transcripts 

The regulation of sterile transcripts of downstream CH regions
has been actively investigated because this regulation may be
[...] understanding the mechanism by which numerous
[...] of specific isotypes expressed in
particular immune responses (as described earlier, and discussed
more fully in Chapter 23). In this view the promoter for the sterile 
transcript of a given isotype may be expected to contain a unique 
combination of motifs mediating the action of antigen, various 
cytokines, and other T-cell influences that are known to regulate 
switching to that isotype. The regulation of the IgE response has 
been studied intensively because of its clinical implications for 
allergy, and it may represent an illustrative example. 

The switching of B cells from the production of μ H chains to ε
is highly dependent on the cytokine IL-4, as demonstrated by the
abolition of IgE synthesis in IL-4 knockout mice (577). In experiments
in vitro, switching of splenic B cells to ε production requires
IL-4 in the presence of an additional signal that can be supplied by
several mitogenic treatments, including LPS, anti-CD40, T-cell
membranes, etc. In vitro production of ε H-chain protein is preceded
by synthesis of sterile Iε-Cε RNA transcripts (also known as
germline ε [Gε] transcripts), which initiate at multiple start sites,
apparently owing to the absence of a TATA site in the promoter
(578). Significantly, the Iε promoter can confer IL-4 inducibility to
reporter gene constructs. The minimum sequence with this capacity
contains binding sites for two known nuclear proteins (579). One of
these is STAT6, a member of the family of signal transducers and
activators of transcription, which transduce signals from many
cytokine receptors to mediate transcriptional regulation. STAT6 is
activated by engagement of the IL-4 receptor and is required for the
IL-4 effect on Iε transcription and switching to Cε, as shown by
experiments in cells lacking STAT6 and in STAT6 knockout mice
(580-582). The other component required in the minimal IL-4
responsive element of the Iε promoter is a binding site for the
CCAAT/enhancer binding protein (C/EBP) family of transcription
factors. This family includes C/EBPα, expressed constitutively in
liver cells, and NF-IL6 (C/EBPβ), which mediates the action of LPS
and inflammatory cytokines such as IL-1, tumor necrosis factor-α
(TNF-α), and IL-6. Another member of this family is the widely
expressed C/EBPγ (also known as Ig/EBP), which lacks a transcriptional
activator domain but can act as a transdominant negative
inhibitor of other C/EBP family members by heterodimerizing with
them (583). Changing ratios of different members of this family in
B-cell development contribute to regulated expression of VH promoters
and the intronic enhancers of the κ and IgH loci (584).

In addition to the STAT6 and C/EBP binding sites, two nearby
motifs closer to the Iε initiation sites contribute to optimal IL-4
inducibility of the promoter and also mediate the synergistic
response of the promoter to CD40 engagement (585); these sites
bind to the complex family of proteins known as NF-κB, which is
described below in connection with the intronic κ enhancer. Elimination
of both NF-κB sites from the promoter inhibits IL-4
inducibility, consistent with the absence of Iε sterile transcripts and
switching to Cε expression in mice with targeted deletion of the
NF-κB component p50 (586).

An additional level of IL-4 control of the promoter apparently
results from an A/T-rich sequence overlapping with some of the
Iε transcriptional initiation sites. This sequence confers repression
of the promoter, apparently due to binding of the chromosomal
protein HMG-I(Y) (587). IL-4 induces phosphorylation of
this protein, perhaps thereby decreasing binding affinity and
relieving the transcriptional repression (588). An additional component
of the murine Iε promoter that contributes to basal activity
but not to IL-4 inducibility is a binding site for the B cell-specific
activator protein (BSAP) (585,589), a transcription factor described
in more detail below. In the human Iε promoter, a BSAP
site apparently enhances both IL-4- and CD40-mediated promoter
activity (589a). Apart from the promoter, sterile Iε transcription
and isotype switching are also regulated by an enhancer
lying downstream from the murine Cα gene, as deduced from
abnormalities in mice in which this enhancer was replaced by a
neomycin resistance gene in all B cells (96): defects were observed
in switching to IgE and several IgG isotypes (but not to IgG1),
with corresponding decreases in sterile transcripts of the affected
isotypes.

Investigations of the regulation of sterile transcripts of other CH 
genes suggest promoters of similar complexity. In general, these 
promoters include motifs that act as response elements for signals 
known to promote switching to the respective isotypes; additional 
discussion can be found in Chapter 23. 

Cis Elements of Ig Gene Enhancers 

As pointed out above, enhancers are regulatory elements that 
stimulate transcription of nearby genes but, in contrast to pro- 
moters, can affect transcriptional initiation thousands of base 
pairs away and in either orientation. Enhancers have been found 
upstream and downstream from genes and, as is the case in the Ig 
genes, in introns. Although their position and orientation inde- 
pendence led to a variety of speculative models to explain how 
these sequences act to stimulate transcription, the prevailing view 
at present is that, like promoters, enhancers bind to nuclear pro- 
teins that facilitate assembly of a transcription initiation complex. 
As mentioned already, even enhancers that are positioned thou- 
sands of base pairs away from the initiation site in terms of linear 
distance on the DNA sequence can come close to the promoter 
simply by forming a large loop of DNA that doubles back on 
itself. Such loops have been observed by electron microscopy in 
several model systems. 

The discussion that follows focuses on murine enhancers, which 
have been studied in most detail. The human homologs are briefly 
described at the end of this section. 

Heavy-Chain Intronic Enhancer 

E-Boxes and Their Binding Proteins 

The enhancer located in the JH-Cμ intron was one of the first
cellular (nonviral) enhancers recognized, and it continues to be a
target of intense study because of its remarkable complexity (as
indicated in Fig. 19A). In both the human and murine loci this
enhancer, often designated Eμ, lies about 0.5 kb 3' from the most
downstream JH region and appears to be spread over about 0.3 kb.
This segment is 5' from the μ S region and is thus routinely
retained on the expressed gene after isotype switch recombination. 
A major effort has been made to analyze the mechanism of action 
of this enhancer by analyzing the component functional motifs and 
their binding proteins that mediate enhancer activity. 

In early work aimed at identifying the positions within the 
murine Eμ that might serve as binding sites for nuclear proteins,
Church, Ephrussi, and colleagues (590,591) used an in vivo version
of the footprinting method described above, examining the
accessibility of the enhancer region to the methylating agent
dimethylsulfate (DMS) in B cells versus nonlymphoid cells. These
experiments located four clusters of nucleotides demonstrating B
cell-specific alterations in DMS reactivity. These clusters defined
a consensus octamer CAGGTGGC that appears not only at these
four positions in the μ intronic enhancer (designated μE1, μE2,
μE3, and μE4) but also at a fifth position not apparent on the footprint
(μE5) as well as at three positions within the mouse κ
enhancer (κE1, κE2, and κE3). These motifs have become known
as E boxes or E motifs. 

The functional significance of E motifs in the enhancer has 
been tested by transfection of DNA constructs containing the 
enhancer with one or more of these motifs altered by deletion or by 
clustered mutations. Most of the E motifs were found to be func- 
tionally redundant in that constructs containing mutations in single 
E motifs or even in several pairs of E motifs still retained substan- 
tial enhancer activity (533,592), but a construct with mutations in 
μE1, μE3, and μE4 showed loss of 98% of the normal enhancer
activity. EMSA experiments established that nuclear proteins can
bind specifically to the E motifs in vitro. Despite the sequence sim- 
ilarity of these motifs to one another, they do not all bind to the 
same proteins. Furthermore, although these E motifs were first 
noted as sites of in vivo alterations in DMS reactivity that were B 
cell specific, the proteins binding to these motifs were detected by 
EMSA experiments in nuclear extracts from a surprising variety of
nonlymphoid sources. 

The identification and characterization of E motif binding pro- 
teins showed a complex regulatory mechanism involving evolu- 
tionarily ancient components. A central role is played by products 
of the E2A gene, including two forms — known as E12 and E47 — 
resulting from alternative RNA splicing (593). These proteins, and 
products of the related E2-2 and HEB genes, bind to the μE2, μE4,
and μE5 motifs, as well as the similar κE2 motif from the κ
intronic enhancer. All of these proteins are members of the HLH
family of transcription factors. Members of this family, which now
includes over 200 proteins from species as diverse as Drosophila,
yeast, and plants, all share a common consensus binding motif
(CANNTG) and the HLH domain: two 13-amino acid α-helices
separated by an intervening loop. This structure mediates homo- or
heterodimer formation between members of this family. Such
dimerization is necessary (but not sufficient) for DNA binding; and the binding
specificity and affinity for particular E boxes depends on both 
members of the dimer pairs. As an example, proteins derived from 
the E2A gene associate with MyoD (or related muscle factors) to 
form heterodimers that bind with high affinity to E box-like motifs 
in promoters of muscle-specific genes (594-596). Apparently 
E2A-encoded proteins, which are expressed virtually ubiquitously, 
may participate in tissue-specific gene regulation by binding with 
HLH partners with narrower tissue distribution. However, the B 
cell-specific E2A factor (known as BCF-1) is a homodimer of E47
subunits, which for unknown reasons seems to form uniquely in B 
cells despite the wide tissue distribution of E47 (597,598). Besides 
the HLH dimerization structure, most HLH proteins (including the 
E2A products) contain an additional element that is necessary for 
DNA binding: a segment of basic amino acids adjacent to the HLH 
on its N-terminal side, hence the designation bHLH for the sub- 
class of proteins that have this basic region. In fact, HLH proteins 
that lack this segment, e.g., the Id group of HLH proteins
(599,600), apparently serve as physiologic inhibitors of E-motif
function by dimerizing with HLH proteins and preventing their 
DNA binding. An additional component of some HLH proteins is 
a leucine zipper; this is an α-helical structure with several leucines
at seven-residue intervals such that they all project from the same 
side of the helix. HLH proteins that include leucine zippers 
(bHLH-zip proteins) can dimerize to each other via hydrophobic 
interactions between the leucines, but they cannot dimerize to HLH
proteins lacking the zipper component (like Id, to which they are
thus resistant). The bHLH-zip proteins include the Myc proteins
(and their heterodimer partner Max) as well as three proteins present
in B-cell nuclei that can bind to μE3 and κE3: USF, TFE3, and
TFEB.
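
The CANNTG consensus lends itself to a simple pattern search. The following Python is a minimal sketch under stated assumptions: the function name is hypothetical, N is taken to mean any of A, C, G, or T, and a sequence match says nothing about which HLH dimer, if any, binds the site. Because the fixed positions of CANNTG read the same on the complementary strand, scanning one strand finds sites on both.

```python
import re

# CANNTG with N = any base; the lookahead lets overlapping sites be reported.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(seq):
    """Return (0-based position, matched hexamer) for every CANNTG site in seq."""
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(seq.upper())]


# The consensus octamer CAGGTGGC quoted in the text contains one such core.
consensus_hits = find_e_boxes("CAGGTGGC")
```

Here `consensus_hits` holds a single match, the hexamer CAGGTG at position 0.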

With this background on the HLH proteins, several features of 
their regulatory role in B cells may be considered. The critical 
importance of E2A proteins for B-cell development is highlighted 
by the phenotype of mice in which this protein was disrupted by 
homologous recombination (601,602). Homozygous mutant mice
develop to term, but most die within a few days of birth. Strikingly, 
the mice fail to generate any B-lymphocytes, although the T-cell 
compartment is grossly normal as are other tissues like muscle in 
which participation of E2A proteins as heterodimers has been doc- 
umented. Perhaps products of the related E2-2 gene are able to 
compensate for lack of E2A in muscle but not in the B-lineage. 
When transgenes expressing E12 or E47 transcripts were added to
the E2A knockout mice, a synergistic action of these two transcripts
in B-cell development was apparent (603). Although
impairment of Ig E box-dependent enhancer function may contribute
to the E2A knockout phenotype (which includes marked
inhibition of Iμ transcription and DJH recombination in fetal
liver), this is difficult to establish because of the dramatic reduction
in other transcripts important for B-cell development, including
RAG-1, mb-1, CD19, and λ5. Transcription of the latter two
genes is known to be regulated by the transcription factor BSAP,
and expression of the gene for BSAP was also significantly
reduced. Thus, impaired interactions between E2A proteins and E-box
motifs in Eμ may not contribute significantly to the knockout
phenotype. Other evidence for a role of E2A products in regulating
Ig genes comes from experiments in which an E47 expression vector
was transfected into a T-lymphocyte line (604); this caused a
dramatic upregulation of Iμ transcription and DJH rearrangement
(although some indirect effects may play a role in this system
because expression of Oct-2 and both RAG genes was observed to
be increased). E47 overexpression also caused Iμ transcription in a
transfected fibroblast line (605). Regulation of E47 may be modulated
by phosphorylation in non-B cells, which may reduce the
ability of this protein to bind to DNA in the B cell-specific homodimer
form (606).

E2-2 and HEB are similar to E2A in structure and in that they 
are expressed in many different cell types, but their roles in Ig 
expression have been less well studied. Early in B-lineage devel- 
opment E2-2 is more highly expressed than E2A and probably con- 
tributes more to E-box binding (607). Homozygous knockouts for 
E2-2 or HEB (608) showed unexplained perinatal lethality similar 
to that observed in E2A knockouts, but only a modest decrease in 
pro-B cell numbers; so these genes are clearly less important for
B-cell development than is E2A. 

The complexity of the function of E boxes in the μ enhancer is
illustrated by investigations on the function of a small fragment of
the enhancer containing only μE5, μE2, and μE3 (609,610). A
tetramer of this fragment is sufficient to induce enhancer activity
in constructs transfected into a B cell. In this enhancer fragment,
μE3 mediates a significant part of the enhancer function, and the
protein that best mediates this effect is the μE3 binding protein
TFE3 (611). Indeed, B cells lacking TFE3 show reduced activation
of Ig secretion (612). In contrast to the activity of the tetramerized
μE5-μE2-μE3 fragment in B cells, this construct showed no activity
in fibroblasts; but a similar tetramer lacking μE5 was active in
both B cells and fibroblasts, suggesting that μE5 confers inhibition
in non-B cells. This inhibition is apparently mediated by the binding
of a non-HLH protein designated ZEB to the μE5 site, allowing
inhibition of TFE3. Inhibition by ZEB can be partially reversed
by overexpression of the E47-like protein ITF-1, which binds to
μE5 and displaces ZEB. Competition between E2A products and
ZEB may similarly contribute to the B-cell specificity of the Eμ
during B-cell development.

FIG. 19. Four murine Ig gene enhancers. A: The H-chain enhancer is located between two Xba I sites in the JH-Cμ [...]
lines). The expanded scale at the bottom shows the central motifs with their names, DNA sequence (rectangles) and
associated binding proteins, where known (cartouches).

Another mechanism by which Eμ activity mediated by HLH proteins
may be regulated is through the Id proteins Id1, Id2, Id3, and
Id4. As mentioned above, the Id proteins are dominant negative
regulators of bHLH proteins because they can heterodimerize to
these transcription factors and prevent them from binding to their
cognate motifs in DNA (613). Id1 and Id2 are expressed in pro-B
cells at a time when E2A proteins are expressed but are not
detectably bound to E boxes. Later these Id proteins are downregulated,
apparently allowing bHLH activation of their target regulatory
regions (599,600). The model that Id expression can suppress
bHLH activation is supported by the phenotype of mice with an Id1
transgene that was designed for late B-cell expression using an mb-1
promoter and the Eμ enhancer (614); these mice showed a
marked impairment in B-cell development very similar to the E2A
knockout mice described earlier.

A clinically important aspect of the E2A proteins is the capacity 
of their genes to participate in oncogenic transformation as a con- 
sequence of translocations that fuse parts of these genes with for- 
eign genetic material from a different chromosome. The human 
chromosomal locus of the E2A gene on 19p13 is the site of at least
two classes of translocation events in acute lymphocytic
leukemia, t(1;19)(q23;p13) and t(17;19)(q22;p13), that produce
oncogenic E2A fusion genes (615,616).

ETS Family Members and Their Role in Eμ

Members of another large family of transcription factors participate
in Eμ regulation: the ETS proteins, which bind as monomers
to the sites in Eμ known as μA (also called π) and μB (617)
(Fig. 19A), as well as to similar sites in the other Ig enhancers. ETS
family members share a conserved 85-amino acid DNA binding
domain that recognizes DNA motifs generally containing a core
GGAA sequence. Mutations at the μA site in transfected reporter
constructs suggest that its integrity is crucial for Eμ activity in pre-B
cells but not later in B-cell development (618). Several widely
expressed ETS proteins, including Ets-1, Ets-2, Erp, and NERP,
can bind to μA, although some evidence favors a physiologic role
for Elf-1 at this site (619). In contrast, the μB site binds primarily
to PU.1, which is expressed only in B cells and macrophages (620).
PU.1 appears to be critical for regulation of all three Ig loci and
several non-Ig genes, including the mb-1 and J-chain genes. The
importance of this protein for lymphoid development has been documented
by PU.1 knockout mice; these mice die before birth,
showing profound defects in lymphoid and myeloid lineages,
although not in erythroid or megakaryocyte cells (621). Although
neither μA, μB, nor the intervening μE3 show substantial enhancer
activity by themselves when multimerized, the fragment containing
these three adjacent motifs does show enhancer activity in B cells
(617), and the spacing between the μA and μB elements is critical
for activity (622). These findings as well as systematic studies of
the in vivo and in vitro interactions between the proteins binding
μA, μE3, and μB elements suggest that activity of this minimal
enhancer depends on complex mutual interactions between these
proteins and the Eμ DNA (623,624). This minimal enhancer shows
activity in macrophages as well as B cells, but inhibitory influences
from flanking E boxes inactivate the enhancer in macrophages, fine
tuning the cell type specificity of the enhancer. Similarly complex
interactions have been reported between the μE2, μA, and μE3
motifs and their binding proteins (624a).
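
Because ETS proteins can recognize the GGAA core on either strand, a search for candidate sites has to consider the reverse complement (TTCC) as well. The sketch below is illustrative only: the function names and toy sequences are hypothetical, and the presence of a GGAA core does not by itself establish that any ETS protein binds there.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_ets_cores(seq):
    """Return (0-based position, strand) for each GGAA core on either strand.

    Positions are reported on the given (top) strand; a "-" hit means the
    top strand reads TTCC, i.e., GGAA lies on the complementary strand.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 3):
        window = seq[i:i + 4]
        if window == "GGAA":
            hits.append((i, "+"))
        elif window == "TTCC":
            hits.append((i, "-"))
    return hits


# Hypothetical toy sequences, one with the core on each strand.
top_hits = find_ets_cores("ACGGAAGT")
bottom_hits = find_ets_cores("ACTTCCGT")
```

In this example `top_hits` reports a plus-strand core at position 2 and `bottom_hits` a minus-strand core at the same position.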

Other Motifs in Eμ

A site in the μ enhancer known as E (unrelated to E boxes) also
has been shown to be necessary for optimal enhancer activity
(624b,624c). This site binds to members of the C/EBP family (discussed
above), which in B cells is represented by Ig/EBP (C/EBPγ)
and NF-IL6, with the latter increasing as B-cell development progresses
(584). IL-6 is known to upregulate Ig secretion in B-cell
lines (625), and the E site is a likely (although not yet documented)
participant in this regulation through activation of NF-IL6.

Despite the initial designation of μE1 as an E box, it lacks the key
criterion now implied by this term, namely the canonical CANNTG
motif that seems necessary for HLH protein binding. Instead,
the μE1 site is apparently bound by a protein known as YY1 (626)
(YinYang1), so called because it can mediate either positive or negative
effects on gene expression, depending on circumstances. This
protein also binds to a similar site in the 3' κ enhancer and participates
in the regulation of a wide variety of genes in many tissues.
YY1 has four zinc fingers (which are zinc-chelating domains
found on a subset of DNA binding proteins) and both positive and
negative regulatory domains (627,628). The function of this protein
in Eμ has not been fully elucidated; mutations of the site decreased
enhancer activity in plasmacytoma cells (533) but had no effect in
other conditions (629).

Matrix Attachment Regions Flanking Eμ

Several factors apparently contribute to the B-cell specificity of
the μ enhancer. As discussed above, two motifs in the enhancer bind
to B cell-specific nuclear factors (μB and the octamer), and one
motif (μE5) can inhibit function of μ enhancer fragment constructs
in fibroblasts. However, several reports have suggested that an additional
measure of B-cell specificity is conferred by sequences flanking
the enhancer that suppress the activity of the central motifs in
nonlymphoid cells (630-632). These suppressive sequences overlap
with A/T-rich sequences that flank the core enhancer and have the
properties of MARs (633). As discussed earlier, MARs are believed
to represent the sites where DNA is tethered to the insoluble protein
scaffold of the nuclear matrix, and they have been found near several
enhancers (including the κ intron enhancer) and associated with
one VH gene. A nuclear protein designated NF-μNR (nuclear factor-μ
negative regulator), detected in several cell lines not expressing
Ig, has been shown to bind to four A/T-rich μNR elements that
lie with one pair in each of the MARs flanking the central enhancer
region (Fig. 19A). In transient transfection assays, deletion of these
μNR elements from enhancer constructs had little effect on transcription
in B cells, which do not express NF-μNR; but in
macrophages and T cells, which do express NF-μNR and cannot
support activity of the intact enhancer, deletion of the μNR elements
activated the enhancer, apparently releasing it from inhibition
mediated by NF-μNR (634). This suggested that binding of the
MARs to the matrix might be necessary for optimal enhancer function
and that NF-μNR might inhibit this interaction in non-B cells.
In support of this model, a matrix protein designated MAR-BP1
that might mediate this interaction has been purified from urea-solubilized
matrix and has been shown capable of binding to the Eμ-associated
MARs; in accordance with the model, this interaction
was inhibited by purified NF-μNR (635).

Although the MARs flanking Eμ have been found to contribute
little to enhancer activity in constructs transfected into B cells, they
are apparently important for Eμ-driven transcription in transgenic
constructs (636). When flanked by its MARs, the Eμ demonstrates
a defining property of an LCR: it confers position-independent tran- 
scription on transgenic constructs integrated at various positions in 
the genome. Other MARs have been reported to demonstrate this 
property (637), but the exact relationship between MARs and LCRs 
is not yet understood. One possibility is that MARs may act by 
relieving superhelical strain because they correlate with sequences 
capable of becoming unpaired and nucleating unwinding (479). 

There is good evidence that enhancers can stimulate transcrip- 
tion by approximating their binding proteins to promoter-binding 
proteins, looping out the intervening DNA and forming a three- 
dimensional transcription factor complex that facilitates formation 
of a transcription initiation complex. To assess whether enhancers 
can mediate changes in chromatin structure apart from this pro- 
moter-enhancer interaction mechanism, B-cell nuclei from mouse 
strains harboring a variety of transgenic Eμ constructs without
linked eukaryotic promoters were tested for access to DNase and
prokaryotic T3 or T7 polymerases (638,639); the constructs contained
promoters for the same polymerases, and some contained a
MAR from the Eμ flanking region. A minimal Eμ enhancer was
found to mediate local factor accessibility, but a MAR was required
to extend the accessibility to a promoter 1 kb away from Eμ, implying
that MARs can collaborate with an enhancer to generate a
domain of chromatin accessibility even without specific interac- 
tions between enhancer- and promoter-bound proteins. 

Additional insights on MAR function have been gained by studying
several other proteins, besides NF-μNR, that can bind to
MARs, including SATB1 (640), nucleolin (641), and Bright, as mentioned
above (555). But considerable additional work will be required
for a comprehensive understanding of how these elements function. 

The 3' α Enhancer and LCR

Discovery of the Enhancer Complex 

A complex regulatory locus has been reported to lie downstream
from the murine Ig H-chain Cα gene. The existence of a regulatory
region in this location was originally inferred when it was found
that plasmacytomas that had undergone spontaneous deletions of
Eμ nevertheless remained capable of high-level Ig secretion
(642-645). Conversely, a myeloma subclone that retained the
intronic enhancer but lost a segment of DNA downstream from the
murine Cα gene was found to have markedly reduced H-chain gene
expression (646). A systematic search in the homologous region of
the rat H-chain locus showed an enhancer (647), and a homologous
mouse enhancer designated 3'αE was found soon thereafter
(648,649) positioned about 16 kb downstream from Cα. The
mouse and rat 3'αE segments lie in opposite orientations and are
flanked by inverted repeats (648). In addition to the 3'αE, Matthias
and Baltimore also reported a weak enhancer in mouse lying only
4 kb downstream from Cα (650).

More recently, Madisen and Groudine (490) analyzed B 
cell-specific DNase I hypersensitivity downstream from Cα and
detected four hypersensitive sites: HS1 and HS2 overlap the previously
described 3'αE, whereas HS3 and HS4 lie further downstream
and identify two new regions with somewhat weaker
enhancer activity in transient transfection assays. The HS3
sequence is almost identical to that of the enhancer described by
Matthias and Baltimore but in inverted orientation. This reflects the
fact that the sequence surrounding the HS12-3'αE is present in the
mouse in a long inverted repeat (651,652) (Fig. 19B). When constructs
containing HS3, HS12, and HS4 linked to a reporter gene
were transfected into a B-cell line, subsequently isolated stable 
transfectants were found to express the reporter gene in a position- 
independent manner. This suggested that the three enhancer 
sequences (HS12, HS3, and HS4) acted together as an LCR. 

Component Motifs of the Enhancer Complex 

Analyses of the regulatory regions downstream from murine Cα
have identified several motifs that bind specific transcription factors
to mediate different aspects of regulation of enhancer function.
The 3'αE has been found to activate transcription strongly in plasmacytomas,
but only weakly in earlier B-lymphoid cells. Part of
this developmental change is attributable to a motif known as E5,
which matches the E-box consensus binding site (CANNTG)
characteristic for members of the bHLH family of transcription
factors. The contribution of the E5 site to enhancer activity is
inhibited in early stages of development by the dominant negative
nuclear regulator Id3, which is expressed in early B-lineage cells
but downregulated in plasma cells (653). At least four other motifs
in the 3'αE have been reported to contribute to enhancer activity in
plasmacytomas but not in early B cells. One site known as αP
binds to a member of the ETS family of transcription factors designated
NF-αP (654). Another is the octamer motif (ATGCAAAT)
common to Ig V-region promoters and several Ig enhancers (655).
A third is a κB-like site that binds to members of the NF-κB/Rel
family of transcription factors described below (653). The fourth is
a G-rich sequence whose function has been demonstrated by mutational
analysis but for which binding proteins have not been identified
(656). Activity of all four of these sites appears to be regulated
by the product of the Pax-5 gene known as BSAP, mentioned
earlier. This protein, which binds to two motifs in the 3'αE, is present
in early B-lineage cells, in which it suppresses enhancer activity;
but its loss in plasmacytomas relieves the suppression. In most
contexts BSAP is a transcriptional activator, but in the 3'αE it
inhibits the enhancer activity in at least two different ways. First, it
prevents the binding of the transcriptional activator NF-αP to the
αP site; second, it causes the octamer, G-rich, and κB-like motifs
to exert an active repressive influence on transcription (654,656).
Optimal activity of the κB site in the 3'αE may require interactions
between NF-κB and proteins binding to an adjacent motif designated
NFE (657).

Apart from the motifs mediating upregulation of the 3'αE during
maturation to plasma cells, a response element in the enhancer
for activation induced by BCR cross-linking has been traced to partially
overlapping sites for the ETS family member Elf-1 and for
members of the AP-1 transcription factor family (658). The same
DNA sites represent a response element for CD40, though perhaps
mediated by slightly different members of the AP-1/ETS families
(659). Two other motifs in the enhancer have been proposed to contribute
to its regulation, but are less well documented: the μE1 and
the μB motifs, which were first noted in the rat 3'αE and which are
partially conserved in mice. The HS3 and HS4 enhancer regions of
mice have been studied in less detail, but the HS4 enhancer apparently
contains functional Oct-1 and BSAP binding sites (660,661).



A role for the 3'αE in isotype switching was suggested by experiments
in which this region was replaced by a neomycin resistance
gene (neoʳ) through homologous recombination in ES cells that
were then used to reconstitute the B-cell population in RAG-2
knockout mice. The resulting B cells showed normal V(D)J recombination
but marked deficiencies in switching to IgG2a, IgG2b,
IgG3, and IgE in vitro, whereas expression of IgM and IgG1 was
normal (96). This observation suggests that the enhancer exerts isotype-specific
effects on switch recombination, possibly by affecting
the extent of germline transcription of the different isotypes
before switch recombination. One caveat follows from the observation
that, in neoʳ replacement experiments to test the role of the
κ enhancers, the neomycin resistance gene apparently affected κ
expression beyond simple deletion of the enhancer (277,278); a
similar effect of the neoʳ gene may contribute to the phenotype
observed in the 3'αE replacement mice.

Roles of the Two IgH Enhancers in B-Cell Development 

From the experiments analyzing the Eμ and 3'αE, it would
appear that Eμ functions primarily early in B-cell development,
with the 3'αE functioning later. Thus, B cells with targeted neoʳ
replacement of the Eμ enhancer (in chimeric RAG complementation
mice) showed cis inhibition of germline transcription and VDJ
recombination (662,663), whereas the similar 3'αE replacement
just discussed showed normal VDJ recombination. The latter construct
affected isotype switching, characteristic of late B-cell maturation.
Moreover, as discussed above, spontaneous deletions of
Eμ in plasmacytomas did not significantly affect Ig secretion,
whereas spontaneous (646) or targeted (664) deletions removing
the 3'αE depressed Ig secretion (in the latter study Eμ was also
missing). Transfected and transgenic constructs driven by the 3'αE
indicate that activity of this enhancer is specific to late, activated B
cells (665-667), perhaps due in part to suppression by BSAP and
Id3 early in the B-lineage (as discussed above) as well as to stimulation
through motifs in the enhancer that are specifically responsive
to antigen binding, T-cell stimuli, or mitogens such as LPS.

The κ Intron Enhancer and NF-κB

By transfecting deleted κ genes and constructs containing segments
from the murine Jκ-Cκ intron linked to reporter genes, several
groups demonstrated an enhancer lying about 0.7 kb 5' from
the Cκ-region gene (668,669). This location corresponds to a B
cell-specific DNase I hypersensitivity site (492,493,670) and also
to a segment of the intron noted to have a remarkably high degree
of sequence conservation between mice, humans, and rabbits
(671). The intronic κ enhancer, sometimes designated iEκ, has
been dissected by fine deletions, mutations, and protein binding
studies (Fig. 17C). As mentioned above, three E boxes were recognized
in this enhancer, and they seem to bind to the same proteins
targeted to the corresponding Eμ motifs: κE1 binds to YY1, κE2
binds to the E2A proteins, and κE3 binds to TFE3 and related proteins.
E-box mutations that reduce enhancer function abolish protein
binding at this motif (533).

NF-κB

An additional motif of major importance in iEκ is the κB motif
GGGACTTTCC. This motif was originally discovered as a binding
site for a nuclear protein, detectable in EMSA experiments, in
extracts from B cells; hence its designation NF-κB (nuclear factor-κB)
(672). The presence of this protein in the same cells capable
of supporting κ enhancer function provided a clue that NF-κB
might be important for mediating enhancer activity. A further correlation
was provided by the pre-B line 70Z/3, which has a
rearranged but functionally silent κ gene. Treatment of 70Z/3 cells
with LPS causes an activation of κ transcription associated with the
appearance of NF-κB activity in 70Z/3 nuclear extracts (673).
Mutations in the κB motif strongly reduce κ enhancer activity
(533), suggesting a critical role in enhancer function. Indeed, this
motif in isolation from the E-box motifs has been shown to possess
enhancer activity, especially in constructs containing tandem
copies of the motif (674). The enhancer activity of these constructs
was much greater when transfected into a B-lymphocyte than into
a fibroblast, consistent with the importance of this motif for the B-cell
specificity of κ gene expression. An important physiologic role
for NF-κB in mediating B-cell activation triggered by antigen
recognition is suggested by the ability of surface IgM cross-linking
to upregulate NF-κB activity (675). Since the discovery of the κB
motif in iEκ, similar motifs have been recognized as critical functional
elements regulating numerous genes outside the Ig loci.
These include the genes encoding MHC class I and class II proteins
and β2-microglobulin, urokinase, IL-2 and IL-2 receptor α chain,
IL-6, granulocyte-macrophage colony-stimulating factor, β-interferon,
inducible nitric oxide synthetase, and TNF-α and -β; the
motif is also found (in tandem repeated copies) in the long terminal
repeat of HIV. Many of these genes are expressed outside the
B-lymphoid lineage; and indeed NF-κB was found to be inducible
in T cells and HeLa cells by phorbol esters (673) and in other cell types
by a variety of agents, including LPS, phorbol esters, TNF-α, and
IL-1. Thus, rather than being B cell specific, NF-κB proteins act in
many cell types, often regulated by agents associated with inflammation
and often regulating inflammation-related responses. In B
cells the protein regulates not only iEκ, but also the 3'αE and sites
in the promoters of I regions of several H-chain isotypes, as discussed
above. The complexity of NF-κB and its clinical relevance
to various immune processes have inspired considerable investigation,
and a large body of literature has resulted on the molecular
basis of NF-κB action.

In general, the induction of NF-κB activity is not blocked by
protein synthesis inhibitors, suggesting that activation must be due
to modification of a preexisting protein molecule. Indeed, most
cells that lack NF-κB activity in their nuclei have an inactive cytoplasmic
form that is unable to bind to κB sites but can be activated
when cytoplasmic extracts are treated in vitro with sodium deoxycholate
(DOC) (676). DOC was found to work by abolishing the
binding of an inhibitory subunit designated IκB to NF-κB (677).
The active form of NF-κB initially studied is a complex of two distinct
subunits of molecular weights of 50 and 65 kDa, known as
p50 and p65, respectively. Thus, the initial experiments suggested
a model of the inactive cytoplasmic NF-κB as a complex of p50,
p65, and IκB; physiologic activators of NF-κB would abolish IκB
activity, releasing the p50-p65 heterodimer to move to the nucleus,
bind to κB sites, and activate transcription.

The cloning of genes encoding proteins of the NF-κB family has provided
a more complex and interesting picture (678). The p50 subunit
and a closely related p52 protein are encoded as precursor proteins
known as NF-κB1 (p105/p50) and NF-κB2 (p100/p52). The
N-terminal halves of both proteins share a domain of about 300
amino acids that is responsible for DNA binding. This domain is
shared by an ancient family of evolutionarily conserved proteins
with functionally homologous members in Drosophila. The family
and the domain are designated Rel, after the v-rel oncogene from
the avian reticuloendotheliosis virus, an early recognized family
member (679). The C-terminal halves of both NF-κB1 and NF-κB2
contain seven repeats of a 33-residue sequence known as the
ankyrin repeat; these repeats are found in diverse proteins from
bacteria to mammals and often mediate protein-protein interactions.
The precursor forms of NF-κB1 and NF-κB2 are inactive
and restricted to the cytoplasm because a nuclear localization
sequence (NLS) is occluded by the C-terminal domain; the protein
is activated by proteolytic cleavage of this domain, which uncovers
the NLS of p50 (or p52). The other members of the Rel family currently
known contain the Rel homology domain without an
inhibitory domain; they instead contain transcriptional activation
domains. These proteins include RelA (also known as p65), c-Rel
(the cellular homolog of the v-rel oncogene), RelB, and the two
Drosophila proteins Dorsal and Dif. Interacting through their Rel
domains, various members of the Rel family can form heterodimers
or homodimers, some showing slight differences in preferred
DNA binding motif. Different members function predominantly
at different stages of B-cell development; in pre-B cells (and
in non-B cells) p50 and p65 are seen, with p50 and c-Rel in mature
B cells and LPS-treated pre-B cells, and p52 and RelB in plasmacytoma
cells (657,680-682). Despite the apparent importance of
the p50 subunit in B-cell development, p50 knockout mice have
grossly normal κ Ig expression with a normal ratio of κ to λ L
chains, possibly because of compensation by other Rel family
members. However, these mice show impairments in activated Ig
production (586) and in expression of certain switched isotypes,
including IgG1, IgA, and IgE (114), as mentioned earlier.

Cloning of IκB genes (683-685) showed another family of proteins,
all of which share the ankyrin repeats characteristic of the
C-terminal half of NF-κB1 and NF-κB2 precursor proteins. This
family includes IκBα, IκBβ, Bcl3, and the Drosophila protein Cactus.
Bcl3 is unusual in that when it binds to p50 or p52 homodimers
it promotes nuclear localization (686) and stimulates transcription
through an activation domain (687,688); this protein is
not directly involved in NF-κB function in the Igκ enhancer. IκBα
or IκBβ can bind to Rel family dimers, retaining them in inactive
form in the cytoplasm. Many of the same stimuli that cause proteolytic
removal of the ankyrin repeat-containing C-terminal half of
the Rel precursor proteins also activate Rel dimer-IκB complexes
through a similar mechanism. A critical step is phosphorylation of
the IκB (689), apparently by an IκB-specific kinase (690,691,
691a). The phosphorylated protein is a target for addition of the
small protein ubiquitin, a modification that flags proteins for
destruction by proteasomes (692). IκBα and IκBβ show somewhat
different preferences for specific Rel dimers, different tissue distribution,
and an interesting difference in regulation. The IκBα
promoter contains multiple κB sites, which are activated when
IκB destruction releases NF-κB, allowing this protein to migrate
to the nucleus and stimulate IκBα resynthesis, which causes inhibition
of NF-κB. Thus, in cells where IκBα predominates, NF-κB
activation is short lived (693). In contrast, IκBβ is not regulated
in this way, so in cells where it predominates, NF-κB activation
may be more prolonged (685). The importance of NF-κB for κ
expression was supported by transfection of AMuLV pre-B cell
lines with an engineered form of IκB capable of suppressing both
RelA and c-Rel; this dual block markedly inhibited germline κ
transcription and rearrangement (694).
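
The contrast drawn above between self-limiting activation (inhibitor resynthesis driven by NF-κB itself) and more prolonged activation (inhibitor synthesis not under NF-κB control) can be illustrated with a toy simulation. The sketch below is not taken from the cited studies: the two-variable model, the formula used for nuclear activity, and every rate constant are assumptions chosen only so that the qualitative behavior is visible. A constant stimulus that accelerates inhibitor degradation is applied from time zero.

```python
def simulate(feedback, t_end=300.0, dt=0.05):
    """Euler integration of a toy inhibitor/NF-kB model (illustrative only).

    ikb  : inhibitor protein level (arbitrary units)
    mrna : inhibitor mRNA level (arbitrary units)
    Nuclear NF-kB activity is assumed to be 1 / (1 + ikb).
    With feedback=True the inhibitor mRNA is transcribed in proportion to
    NF-kB activity; with feedback=False the mRNA stays at its resting level.
    """
    # Hypothetical rate constants, not measured values.
    k_tx, k_mdeg = 0.2, 0.02   # mRNA synthesis per unit activity, mRNA decay
    k_tl = 0.9                 # inhibitor protein made per unit mRNA
    k_deg_basal = 0.1          # inhibitor degradation at rest
    k_deg_stim = 1.0           # extra degradation caused by the stimulus

    ikb, mrna = 9.0, 1.0       # resting steady state, activity = 0.1
    trace = []
    for _ in range(int(t_end / dt)):
        activity = 1.0 / (1.0 + ikb)
        trace.append(activity)
        if feedback:
            mrna += dt * (k_tx * activity - k_mdeg * mrna)
        ikb += dt * (k_tl * mrna - (k_deg_basal + k_deg_stim) * ikb)
    return trace


for name, fb in (("NF-kB-inducible inhibitor", True),
                 ("constitutive inhibitor", False)):
    trace = simulate(fb)
    print(f"{name}: peak activity {max(trace):.2f}, late activity {trace[-1]:.2f}")
```

In this sketch the feedback case rises to a peak and then settles at a clearly lower level while the stimulus persists, whereas the case without feedback climbs to a plateau and stays there. Only that qualitative difference is meaningful; the numbers depend entirely on the assumed constants.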



A MAR and Silencer Elements Upstream from iEκ

Upstream from iEκ lies an A/T-rich region that has been identified
as a MAR (695). Kappa gene constructs inserted into the
mouse genome as transgenes or stably integrated into B cells
demonstrated that this MAR contributes to transcription of the
associated gene (696,697) and to associated demethylation (698).
[The demethylation also appears to be regulated by NF-κB (699).]
Within the MAR an AP-1 binding site has been reported (700). This
site appears to be required for optimal enhancer activity when
transfected into LPS-treated pre-B cells and mature B cells and
shows LPS-inducible binding in EMSA experiments. In transfections
into HeLa and T cells, a 232-bp fragment 5' from the κB
sequence inhibited expression of a linked gene, whereas this fragment
did not affect expression in B cells (701). Thus, this fragment
may contain a gene silencer that is active in non-B cells and contributes
to the B-cell specificity of the enhancer. A shorter element
designated κNE (negative element) lying just upstream from κB,
and conserved at this position in humans and rabbits, has been
reported to inhibit enhancer activity (702); this inhibition was
reversed in a B cell-specific manner by another element a few base
pairs upstream.

The 3' κ Enhancer

Components of the Enhancer 

As in the H-chain system, the search for a second enhancer
downstream from the Cκ gene was inspired by a cell line whose
expression of its endogenous κ gene seemed difficult to explain in
terms of the known intronic enhancer. Thus, the myeloma S107
was found unable to support transcription of transfected constructs
driven by the κ intronic enhancer because this line lacks NF-κB
activity, yet it is able to transcribe its endogenous κ genes (703). A
search 3' from the Cκ gene showed a second enhancer about 9 kb
downstream from Cκ (704). This enhancer is about sevenfold
stronger than the κ intronic enhancer and in transfection experiments
is B cell specific. Inclusion of the 3' enhancer in transgenic
constructs leads to more than 20-fold higher transgene expression
than observed with constructs lacking this sequence (705). Like the
iEκ, the 3'Eκ can be activated in pre-B cells by LPS. The functional
elements of the murine enhancer have been dissected,
demonstrating a complex set of motifs mostly clustered in a 132-bp
core enhancer (Fig. 19D). One important motif identified by
deletion and multimerization constructs contains the sequence
CATCTGTT, which conforms to the CANNTG consensus for HLH
binding motifs; indeed, this motif appears to bind to such a protein
because the activity of multimerized versions of this motif, as well
as the activity of the entire enhancer, can be inhibited by the HLH
protein Id described above (706). The second principal motif binds
PU.1, a member of the ETS domain family of transcription factors
described earlier in this section. The binding of PU.1 to the
enhancer recruits a second B cell-specific protein, designated
PU.1 interaction protein or Pip (formerly NF-EM5), which binds
to an adjacent DNA segment whose integrity is necessary for full
enhancer function (707,708). Pip is homologous to members of the
interferon regulatory factor (IRF) family of transcription factors,
and its binding to PU.1 is also important in the murine λ enhancers
described below. The Pip-PU.1 interaction requires phosphorylation
on a particular serine residue of PU.1, suggesting the possibility
that the degree of phosphorylation might contribute to physiologic
regulation of enhancer activity.
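
The statement that CATCTGTT conforms to the CANNTG consensus can be checked mechanically. The following is a minimal sketch, assuming plain uppercase or lowercase DNA strings with only A, C, G, and T; the function name find_eboxes is invented for this illustration. Because the reverse complement of CANNTG is again CANNTG, scanning a single strand is sufficient for this particular consensus.

```python
import re

# E-box consensus from the text: C-A-N-N-T-G, where N is any base.
# The lookahead lets overlapping sites be reported as well.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq):
    """Return (0-based position, hexamer) for every CANNTG match in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq)]


# The 3' kappa enhancer motif quoted in the text.
print(find_eboxes("CATCTGTT"))    # [(0, 'CATCTG')]

# The kappa-B decamer has no CANNTG hexamer, so nothing is reported.
print(find_eboxes("GGGACTTTCC"))  # []
```

A match to the consensus says only that a site could bind an HLH protein; as the text notes for the site first designated μE1, actual binding has to be established experimentally.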



Upstream from these two motifs lies a sequence that was found
by mutation analysis to be necessary for maximal enhancer activity
and which was found, by λgt11 library screening, to bind to the
transcription factors ATF-1 (activating transcription factor) and
CREM (cyclic AMP response element modulator) (709). Both of
these proteins can bind to PU.1 in vitro. The ability of CREM to
function in the enhancer is supported by the observation that dibutyryl
cAMP can increase the 3'Eκ enhancer activity. The DNA
binding motif for these factors also corresponds to an AP-1 site,
and the components of AP-1, c-Fos and c-Jun, were also found to
activate the enhancer through this motif (710). Indeed these two
AP-1 subunits can participate in a higher order complex with PU.1
and Pip that is detectable biochemically by EMSA and functionally
by synergistic activation of the enhancer by these proteins when
expressed in fibroblasts, in which the enhancer is normally silent.
A site detected by in vivo footprinting that is occupied in pre-B and
B cells, but not in plasma cells, was shown in EMSA experiments
to bind a protein with characteristics of BSAP (711). At about 90
bp upstream of the core enhancer is an additional site that binds the
transcription factor Sp1 and that is required for maximal enhancer
activity in some constructs (712).

Just downstream from the murine 3' κ enhancer is a negative
regulatory region that seems to suppress the activity of this
enhancer in pre-B cells (626). One component of this region
appears to be a binding site for the zinc-finger protein YY1 or NF-E1,
discussed earlier in the context of this protein's function in Eμ.
An additional component may be a binding site for lymphocyte
enhancer factor-1 (LEF-1), an HMG-related protein that binds in
the minor groove of DNA and causes DNA bending; this protein is
also a known component of TCR-α enhancer regulation. Binding
activity at the presumptive LEF-1 site was depressed by treatment
with LPS, suggesting the possibility that this effect may explain
how the 3'Eκ enhancer activity might be upregulated by LPS in the
absence of a site for NF-κB (713).



Roles of the Two κ Enhancers in B-Cell Development

The intronic κ enhancer seems critical for supporting Vκ-Jκ
recombination because targeted disruption of the enhancer by neoʳ
replacement severely impairs or abolishes such recombination
(277,714). Transfection experiments suggest that this
enhancer is moderately active in pre-B cells but can be upregulated
with activating agents such as LPS and phorbol esters, which exert
their effects through NF-κB activation. Although the enhancer is
active in plasmacytoma cells, its integrity does not seem critical for
κ gene expression at this stage, when the 3'Eκ is more active; however,
some transfection experiments suggest that the two enhancers
may function synergistically in mature B and plasma cell stages
(715,716). Even at the pre-B stage the 3'Eκ is active and can support
lineage-specific expression of a transgenic κ gene lacking iEκ (717).
Both enhancers seemed necessary to support somatic mutation in κ
transgenes (432), although some of this effect may be mediated by
effects on transcription. In mice with targeted disruption of 3'Eκ,
decreased numbers of κ-expressing B cells were observed, as though
this enhancer contributes to Vκ-Jκ recombination (278), although in
transgenic animals harboring constructs capable of Vκ-Jκ joining,
the 3'Eκ seemed primarily to suppress recombination. In the absence
of this enhancer, recombination occurred in T cells or prematurely in
pro-B cells (264). In vivo footprinting studies have shown that
changes in activity of the two enhancers during B-lineage development
are accompanied by changes in occupancy of specific motifs
by nuclear binding proteins (711,718).

λ Enhancers

For many years the λ locus frustrated investigators searching for
an enhancer in the Jλ-Cλ intron, but with the recognition of
enhancers downstream from C genes, attention turned to these
regions and λ enhancers were identified (719). Highly homologous
B cell-specific enhancers are located 15.5 kb downstream from 
murine Cλ4 (the Eλ2-4 enhancer) and 35 kb downstream from Cλ1
(the Eλ3-1 enhancer). Four functional motifs were identified in each
enhancer (720). Two, λA and λB, are critical for enhancer function
in that mutations in either abolish activity, although neither of them
is active when present in multimerized form in constructs. The λB
motif binds to PU.1 and Pip (721,722), which also bind together in
the 3'Eκ, as discussed above. The λA and λB elements are flanked
by E box-like motifs, which may bind to HLH proteins active in 
other Ig enhancers. 

Human Ig Enhancers 

The above account of regulatory regions in the three Ig loci has 
focused on the mouse genes because these regions were discovered 
before homologous regions of other species, and they have been 
most thoroughly studied. Sequence analyses of homologous human
regions have shown a high degree of conservation, especially in the
core enhancers containing functional transcription factor binding 
motifs; and other enhancer properties — including DNase hyper- 
sensitivity, in vitro protein binding, and functional enhancer activ- 
ity — also have been documented for human Ig enhancers. Some 
differences in enhancer number have resulted from gene duplica- 
tions specific to mice or humans. Thus, in contrast to the two 
murine λ enhancers, the human λ locus includes a single enhancer
downstream from Cλ7 (723-725). Conversely, the duplication of
the two γ-γ-ε-α segments of the human IgH locus has led to duplicated
enhancer complexes downstream from the human Cα genes
(491,726). Analyses also have been conducted on the human iEκ
(493,671,727-729), the human 3'Eκ (140,730), and the human Eμ
(486,731-734).

Generalizations Concerning 
Ig Transcriptional Regulation 

Each gene segment within the three Ig loci is regulated by 
nearby DNA regions outside the coding exons. The regulatory 
regions are composed of several motifs that control transcription by 
binding to specific nuclear factors that can stimulate or inhibit 
transcriptional initiation. Some of the motifs are shared between 
different enhancers or promoters and some are unique to one 
region. The presence (or activity) of the nuclear factors in different 
cell types often correlates with the expression of the associated 
gene segment. The research to date seems to have identified many 
components of a complex regulatory machinery, but there are many 
gaps in our understanding. We know little about what regulates the 
nuclear factors — how the known components change in response 
to cell maturation and to external signals such as antigens, 
cytokines and T cells— nor do we understand how the actions of 
these nuclear factors are integrated with other chromosomal 
changes such as histone acetylation, DNA methylation, matrix 
attachment, and nucleosome repositioning. 



APPLIED SCIENCE OF 
IMMUNOGLOBULIN GENES 

Up to this point, this chapter has considered how our current 
knowledge of Ig genes can explain the antibody response. In the 
present section we briefly address several examples of other areas 
to which this knowledge has been applied with interesting results. 

Ig Genes in Lymphoid Malignancies 

Many malignant tumors have been shown to derive from single 
transformed cells that have undergone clonal expansion with fail- 
ure of normal cellular controls. Lymphoid malignancies of the B- 
lineage provide a classic demonstration of clonality because they 
derive from cells with unique genetic material (rearranged Ig 
genes) distinct from the bulk DNA of the same organism. Simi- 
larly, analyses of TCR genes have been valuable in establishing 
clonality of T-cell malignancies. Examination of both gene systems 
is useful in establishing the lineage of neoplasms that lack charac- 
teristic phenotypic markers and in detecting clonal rearrangements 
as a marker for malignancy; a huge clinical literature has accumu- 
lated (735-738). 

The first general strategy to analyze clonality has been to isolate 
genomic DNA from neoplastic and normal tissue from the same 
patient and to examine it for Ig or TCR gene rearrangements using 
Southern blotting. A nonclonal population of B-lymphocytes con- 
tains a mixture of rearrangements so numerous that, after restric- 
tion enzyme digestion and Southern analysis, the rearranged frag- 
ments bearing the JH sequence are spread thinly throughout the 
length of the gel, so no specific rearranged band is detectable. In 
contrast, clonal expansion of a specific rearrangement, as in a
lymphoid malignancy, will produce a distinct rearranged band,
detectable even when the malignant cells are present as a minority 
in the cell population. Admixture experiments have demonstrated 
that rearranged bands can be detected when malignant clonal cells 
represent as little as 1% of the population, although 5% is a more 
typical detection limit. The Southern blotting technique can be 
used on peripheral lymphocytes as well as on biopsy specimens of 
solid tumors. More recently the power of the PCR to amplify
minuscule amounts of DNA has made it possible to assess clonal
rearrangements from samples as small as histologic tissue sections
or to detect rare leukemic cells in a 10^5-fold excess of normal cells; such
assessments are critical for clinically important judgments (739). 
Ig gene rearrangements have been demonstrated in acute lym- 
phoblastic leukemia (ALL), chronic lymphocytic leukemia (CLL), 
multiple myeloma, B-cell follicular and diffuse lymphoma, hairy 
cell leukemia, B-cell prolymphocytic leukemia, Hodgkin's disease,
Burkitt's lymphoma, and in the blast crisis of chronic
myelogenous leukemia (738).
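
The detection limits quoted above can be put side by side in a short sketch. The threshold constants are taken from the figures in the text (1% in the best case and 5% typically for Southern blotting, roughly one leukemic cell per 10^5 normal cells for PCR). The function name and the use of hard cutoffs are illustrative assumptions; real assay sensitivity varies with the probe, the sample, and the laboratory.

```python
# Approximate detection limits, expressed as the fraction of clonal cells
# in the sample. The values come from the text; treating them as sharp
# cutoffs is a simplifying assumption.
SOUTHERN_BEST_CASE = 0.01   # about 1% in admixture experiments
SOUTHERN_TYPICAL = 0.05     # about 5% in routine use
PCR_LIMIT = 1e-5            # about 1 clonal cell per 10^5 normal cells


def detectable_by(clonal_fraction: float) -> dict[str, bool]:
    """Report which assays would be expected to detect a clone."""
    if not 0.0 <= clonal_fraction <= 1.0:
        raise ValueError("clonal_fraction must lie between 0 and 1")
    return {
        "southern_typical": clonal_fraction >= SOUTHERN_TYPICAL,
        "southern_best_case": clonal_fraction >= SOUTHERN_BEST_CASE,
        "pcr": clonal_fraction >= PCR_LIMIT,
    }


if __name__ == "__main__":
    for fraction in (0.10, 0.02, 0.0001):
        print(fraction, detectable_by(fraction))
```

A clone making up 2% of the sample falls between the two Southern blotting limits, which is the range where a negative blot cannot rule out residual disease and PCR becomes the more informative assay.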

DNA analysis has been particularly revealing in the case of ALL 
cells, which before the advent of DNA analysis were generally 
(80%) of uncertain lineage, lacking both B- and T-phenotypic markers.
The majority of these null ALL samples analyzed contain
rearrangements of H-chain Ig genes, and about 40% also have 
rearranged L-chain genes, although no surface Ig is detectable. The 
cells thus typically resemble the pre-B stage of lymphoid develop- 
ment. During the course of the disease some ALL cells show clonal 
evolution of additional Ig gene rearrangements, thus further matching
the pre-B phenotype. When DNA from serial peripheral blood
samples of pre-B ALL patients is examined by Southern blotting,
the clonal rearranged band can be used as a marker of leukemic 
remission and relapse (740,741). Ig gene rearrangements also have
been sought in T-cell ALL; rearrangement of the Ig H-chain gene
occurs rarely in these malignancies, and L-chain rearrangement is
not observed. Conversely, TCR gene rearrangement occurs in
roughly half of pre-B cell ALL, especially at the TCR-γ locus
(740). These examples of lineage infidelity may reflect a developmental
stage before definitive commitment to B- or T-cell differentiation,
when both gene systems may be susceptible to the recombinase
machinery.

Chromosomal Translocations Involving Ig Gene Loci 

In contrast to the rearrangements in tumors discussed so far, 
which represent physiologic recombination events that occurred in 
premalignant progenitor cells, an entirely different kind of Ig gene 
rearrangement has been found in several lymphoid neoplasms: 
rearrangements that appear to have played a role in the malignant 
transformation itself. All except the two most recently discovered 
cases involve genes normally expressed in the B-cell lineage. 

c-myc Translocations in Burkitt's Lymphoma 

The first example to be elucidated, and one that still represents
a prototype, was in Burkitt's lymphoma, where a consistent pattern
of chromosomal translocation has been observed involving a
reciprocal exchange between chromosome 8 and either chromosome 14,
chromosome 2, or chromosome 22. The latter three chromosomes
contain the three human Ig gene loci (H chain, κ, and λ, respectively)
(47,742), and the Ig genes were mapped by in situ hybridization to 
the same bands involved in the chromosome 8 translocations. The 
translocation breakpoints have been cloned and sequenced, provid- 
ing a detailed picture of these nonphysiologic recombination prod- 
ucts (Fig. 20). 

The sequence consistently donated from chromosome 8 has been 
found to be the c-myc oncogene, the mammalian cellular homolog 
of the oncogene first identified in avian leukosis virus (743). 
Translocation of c-myc into the IgH locus is also observed in the 
12;15 translocations commonly seen in murine plasmacytomas. 
The CH regions most frequently involved are α (in the mouse) and
μ (in human cells), generally the alleles on the nonexpressed
chromosome. These translocations leave the IgH and c-myc genes
joined head to head (in opposite orientations). As shown in Fig. 20, 
the first exon of the c-myc gene is commonly absent from the IgH- 
associated translocation product, but because this is a noncoding 
exon, such genes can still encode a functional protein. The site of 
the translocation can vary over a wide distance for both the c-myc 
and IgH genes, which may be separated by more than 100 kb in 
some 14q+ chromosomes. Generally the translocations in Burkitt's 
lines fall into two categories roughly paralleling the two clinical 
forms of the disease: the endemic African type and the sporadic 
type (744). The endemic Burkitt lines seem to represent an early B
lymphoid stage as they make primarily membrane Ig; these lines
demonstrate Ig locus breakpoints near V and J regions and c-myc
breakpoints far 5' of exon 1. However, recent reports that endemic
lines show evidence of somatic mutation (744a,744b), combined 
with the realization that VDJ recombination may be reactivated in 
the GC (as discussed earlier), have suggested that endemic lines 
may represent later stages in B-cell development than investigators 
initially thought. The less frequent translocations involving κ and λ
bring these genes downstream from c-myc and oriented in the same
5'-3' direction, as shown in Fig. 20. Analysis of these translocations
has incidentally provided an assignment of the 5'-3' orientation
of the normal Ig gene loci with respect to the centromeres on
their respective chromosomes.

Several observations suggest that the translocation event bring- 
ing the c-myc gene adjacent to the Ig gene locus participates in 
the malignant transformation of the progenitor lymphocyte. In 
normal cells the Myc protein plays a complex role in regulating 
cell cycle progression, probably through transcriptional activa- 
tion of genes associated with cell division. The Myc protein has 
a structure typical for the bHLH-zip family of HLH transcription 
factors discussed earlier in this chapter. In association with a het- 
erodimer partner named Max, Myc binds in vitro to a typical E- 
box motif CACGTG (745) and regulates several genes that might 
be relevant to its role in regulating proliferation (746,747). In 
Burkitt's lines c-myc transcription is generally maintained at relatively
high steady-state levels, which could contribute to the malignant 
proliferation of these cells. Among somatic cell hybrids con- 
structed between a Burkitt's lymphoma and mouse myeloma, 
human c-myc transcripts were found in hybrid cells containing 
the 14q+ chromosome but not in those with the normal chromo- 
some 8; this finding suggests that translocation of the c-myc gene 
into the Ig locus was responsible for activating its transcription in 
cis (748). This activation could be mediated by proximity to the 
intronic enhancer in some Burkitt's lymphomas. In others, in 
which this intronic enhancer is absent from the translocated
c-myc locus, c-myc activation might be mediated by the 3'α enhancer
(3'αE) and associated regulatory regions, which have been shown to be
competent to stimulate transcription from the c-myc promoter (490).
The potential long-distance regulation implied by the LCR prop- 
erties of these regions may explain the dysregulation of c-myc 
expression in some Burkitt's lines even where the c-myc gene is 
relatively distant from the IgH locus. The observation of T-cell 
leukemias with c-myc translocations into the TCR-α chain locus
(749) supports the generality of deregulated c-myc expression in
oncogenesis.
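
As a small illustration of the E-box motif mentioned above, the following sketch scans a DNA string for CACGTG. The function names and the example sequence are hypothetical; the only detail taken from the text is the motif itself. Because CACGTG is its own reverse complement, a hit on one strand is also a hit at the same position on the other, so a single-strand scan is sufficient for this particular motif.

```python
import re

EBOX = "CACGTG"  # canonical E-box bound by Myc-Max in vitro (from the text)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_ebox_sites(sequence: str, motif: str = EBOX) -> list[int]:
    """Return 0-based start positions of every motif occurrence.

    A lookahead pattern is used so that overlapping occurrences would
    also be reported.
    """
    pattern = re.compile(f"(?={re.escape(motif.upper())})")
    return [match.start() for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    example = "ttCACGTGaaCACGTG"  # made-up sequence, not a real promoter
    print(find_ebox_sites(example))
    print(reverse_complement(EBOX) == EBOX)
```

A six-base motif occurs by chance about once every 4 kb in random sequence, so a match found this way is only a candidate site; binding has to be confirmed experimentally.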



Other Oncogenic Translocations Involving the 
Immunoglobulin Loci 

In addition to the translocations of the c-myc gene on chromo- 
some 8, other translocations involving the IgH locus have been 
reported in lymphoid malignancies, and translocation breakpoints 
have been cloned in the hope of identifying new protooncogenes
that (by analogy with c-myc) might be activated by the translocation.
An 11;14 translocation seen in some CLLs and centrocytic
B-cell lymphomas was found to join the nonexpressed IgH locus
to a region of chromosome 11 that has been termed bcl-1 (B-cell
leukemia/lymphoma-1) (750). Although attempts to detect a
deregulated transcription unit in this region were initially
unsuccessful, an oncogene candidate has emerged that was first
discovered as a partner in a different chromosomal rearrangement,
one involving the parathyroid hormone gene (751). This oncogene, known
as PRAD-1 (parathyroid adenomatosis 1), encodes cyclin D1 (a
regulator of cell division) and maps to the same band (11q13)
involved in the translocations with the IgH locus. Cyclin
D1/PRAD-1/Bcl-1 transcripts are elevated in several CLL lines
with bcl-1 translocations (in contrast to other CLLs lacking this
translocation) and in approximately 90% of mantle-cell
lymphomas (752).

FIG. 20. Translocations between Ig genes and the c-myc locus observed in lymphoid malignancies. The c-myc gene
on chromosome 8 (left) is associated with translocations resulting from chromosomal breaks either above (top) or
below (bottom) the gene itself. In the former case, translocation to the IgH locus at 14q32 leads to a 14q+ chromosome
bearing c-myc sequence and IgH sequence in opposite orientations (upper right). In the latter case translocations
with the Igκ locus at 2p11 or the Igλ locus at 22q11 lead to junctions in which the myc and Ig genes are in the
same orientation.

Another translocation involving chromosome 14 occurs in the
majority of cases of follicular lymphoma and involves chromosome
18 band q21. Analysis of cloned fragments containing the
translocation breakpoint led to the identification of a new oncogene
designated bcl-2 (753). The 14;18 translocation creates a
bcl-2-Ig fusion gene whose mRNA transcripts are elevated as a
result of both transcriptional deregulation and altered RNA processing.
The bcl-2 gene is unusual among protooncogenes in that
its normal role appears not to be the promotion of cell proliferation
but rather the inhibition of programmed cell death, or apoptosis
(754). It was the first identified member of a family of proteins
that form heterodimers that stimulate or inhibit apoptosis
(755). The Bcl-2 protein is expressed in tissues where some cell
populations undergo apoptosis but selected subsets are spared
(756). In particular, the protein can be detected in the apical light
zone of GCs, where it is believed that somatic mutation of Ig 
genes is ongoing and only the subset of B cells displaying surface 
Ig with high binding affinity for antigen can survive. Bcl-2 pro- 
tein is also expressed in surviving T cells in the thymic medulla. 
Transgenic mice with a bcl-2-Ig minigene developed a lymphoproliferative
syndrome due to the extended life span of their lymphoid
lineage; eventually most mice progressed to a lymphoma
(757), supporting an important role of the bcl-2-Ig translocation
in malignant transformation.

A translocation involving the IgH locus and chromosome 19
band q13.1 is a recurring but uncommon abnormality in CLL
(758,759). The cloned translocation breakpoint showed a gene,
designated bcl-3 (760), that encodes a protein containing seven of
the ankyrin repeats characteristic of IκB-like proteins, which
were discussed earlier in this chapter. The recombination is often
head to head near Sα in the IgH locus and leads to a marked elevation
of intact Bcl-3 transcripts in CLL lines with the 14;19
translocation, as compared with lines without this abnormality. 
The identity of the genes regulated by Bcl-3 is not known, 
although presumably these genes are normally regulated by Rel 
family dimers. 

Bcl-6 was first identified by cloning the breakpoint at the most 
common translocation observed in B-cell non-Hodgkin's lym- 
phoma, that involving the IgH locus and 3q27 (761). Intact bcl-6 
transcripts are increased by the translocation. The 95-kDa DNA-binding
Bcl-6 protein has six zinc finger domains and is able to
mediate strong transcriptional repression, largely as a result of its 
N-terminal POZ domain (762). Bcl-6 transcription is downregu- 
lated on B-cell activation, but the protein is detected at relatively 
high levels in GCs (763). Bcl-6 knockout mice fail to form GCs 
and do not show affinity maturation, but have diffuse inflammation 
with an increase in IgE-bearing lymphocytes (764,765). 

The bcl-7A gene at 12q24.1 (766) and bcl-8 gene at 15q11-13
(767) were cloned from breakpoints in IgH translocations in lymphomas.
The Bcl-7A sequence appears homologous to the actin-binding
protein caldesmon, and the bcl-8 gene is expressed in testis
and prostate; but little more is known about the function of these 
genes at present. 

Many of the translocations described above can be detected by 
PCR as clonally unique amplification products, which can be used 
as markers for minimal residual disease (768). 

Hybrid Recombinations 

Another class of aberrant chromosomal rearrangements involv- 
ing the IgH locus includes the chromosome 14 inversions observed 
in some T-cell lymphomas (769,770). These remarkable rearrangements
occur between the IgH locus at 14q32 and the TCR-α locus
at 14q11 and clearly appear to have been mediated by the V assembly
recombinase. In one well-studied example, an Ig VH segment is
joined to a TCR Jα segment on the telomeric end of the chromosome
whereas in the centromeric region a signal joint is found;
because this joint is not reciprocal to the VH-Jα coding joint on the
same chromosome, at least two recombination steps must have 
occurred. In these chromosomal inversions no oncogene sequence 
seems to be involved, so their relationship to the malignancy is 
uncertain. By PCR, similar hybrid antigen receptor recombinations 
can be detected at low levels in normal individuals; these recombi- 
nations are present in higher than normal frequencies in patients 
with ataxia telangiectasia and in agricultural workers exposed to
chemicals, a population with increased risk for lymphoid malig- 
nancy (771). 



Abnormal Ig Gene Loci in Disease 

Several immune deficiency diseases are associated with selective 
or global decrease in serum Ig levels. Although it might have been 
expected that elucidation of the Ig gene loci would clarify the mol- 
ecular basis for these diseases, very few examples of genetic defects 
in Ig genes have been reported. Indeed, examination of restriction 
fragment length polymorphisms (RFLPs) detected with probes in
the IgH locus has indicated that at least two genetic defects associated
with Ig H chains (familial selective IgA deficiency and the
hyper-IgM syndrome) are not linked to the IgH genes. Many Ig 
deficiencies are undoubtedly caused by cellular abnormalities in the 
complex mechanisms of B-cell development, T-cell interaction, 
lymphokine response, antigen triggering, and so forth. 

In one of the rare examples of an Ig deficiency due to Ig gene 
mutation, defects in the Cκ genes have been described in a patient
with selective deficiency of κ synthesis (772). Different point
mutations in each of the Cκ alleles of a patient were observed, leading
to amino acid replacements that could have disturbed the
intradomain disulfide bonds critical to Ig structure. 

In the H-chain locus six large deletions have been described in 
humans (Fig. 7), the largest involving the loss of the γ1, ψε1, α1,
γ2, and γ4 genes (55); despite the complete absence of the corresponding
H chains in the serum, individuals with homozygous
deletions generally show no clinical evidence of immunodeficiency.
However, homozygous defects in the μ gene can result in
agammaglobulinemia (772a).

The contribution that specific polymorphic V genes might make 
to autoimmune disease has been explored by several investigators. 
Although human V gene sequences are remarkably well conserved 
between individuals, it is known that human V loci display large 
insertion/deletion polymorphisms in the population (381,382, 
773-777). Extrapolating from this fact, it might be supposed that 
the presence of specific unusual germline V genes could increase 
the risk for certain autoimmune responses, much as certain MHC 
haplotypes are associated with increased risk for such diseases. 
However, the genes expressed in autoimmune antibodies have not 
proved to be rare in the population; and although some disease 
associations with V haplotypes have been reported (778,779), 
genetic variation in V genes does not appear to be a major risk 
factor. Specific immune defects due to the absence of specific V 
regions are also possible. In Navajos a defective copy of the VκA2
gene, which encodes the predominant Vκ chain in antibodies
against Haemophilus influenzae, has been suggested as possibly
contributing to the high susceptibility in this population to infections
with this bacterium (780).

Genetic Engineering of Ig Genes 

Using the considerable knowledge of Ig genes that has been 
gained from modern molecular cloning techniques, a number of 
investigators have been exploiting these genes as bioengineering 
tools for various basic research and applied science goals. Although 
a detailed treatment of these studies is beyond the scope of this 
chapter, we will briefly consider a few of the more interesting ideas. 

One basic research goal is the exploration of structure-function 
relationships of Ig molecules by engineered modifications of Ig 
structure. The IgM molecule has been studied in this way initially 
by exploiting natural mutant hybridoma lines making abnormal 
antibodies. The abnormal μ genes were cloned and sequenced, and
observed mutations that were candidates for causing the phenotypic 
abnormality were either reverted or reintroduced into normal genes 
by site-directed mutagenesis to verify their effects. Using this 
approach, a 39-bp deletion near the C-terminal end of the molecule 
was found to prevent pentamer formation (781) and replacement at 
codon 436 in the CH3 domain was found to depress complement-activated
cytolysis (782). Sequences responsible for differential
complement activation by human γ isotypes were identified by
exchanging various residues from one isotype to another and following
the resulting effects on complement activation (783). In an
analysis of the mouse γ2b H chain, systematic alteration of amino
acids on the surface of the CH2 domain by in vitro mutagenesis led
to the identification of three residues critical for the binding of the
complement factor C1q (784). A structure-function analysis of the ε
H chain was undertaken by testing ε chain fragments (generated by
bacterial expression of engineered segments of the ε gene
sequence) for their biologic activity. A 76-amino acid fragment
spanning the CH2-CH3 boundary was found to bind to mast cells in
vitro and to inhibit the action of IgE in vivo (Prausnitz-Küstner
reaction) (785). For V-region structure-function analysis, site-directed
mutagenesis of V regions has been used to study the determinants
of antibody affinity and specificity (786). These studies suggest that
biotechnology can provide powerful methods for analyzing important
features of Ig protein structure.

When structure-function relationships are sufficiently understood,
the next logical engineering challenge is to improve on
nature, designing Ig molecules with specific desired properties by 
modifying appropriate segments of Ig genes. One goal has been to 
combine the advantages of human and murine monoclonal anti- 
bodies to make medically useful products. Murine hybridomas 
grow quickly, produce large amounts of antibody and are quite sta- 
ble relative to human hybridomas, which are generally poor in all 
three respects. Yet for many applications — like the use of antitumor 
antibodies in human patients — the more easily generated mouse 
monoclonals would be unsatisfactory because of their immuno- 
genicity and their relative inefficiency in generating C region- 
dependent effector functions (such as complement fixation and 
antibody-dependent cellular cytotoxicity). A solution that has been 
tested by several laboratories is to construct chimeric genes linking 
a human C-region gene to a murine V region cloned from a mouse 
hybridoma generated against the antigen of interest; these con- 
structs are then transfected into nonsecreting variants of mouse 
hybridomas yielding transfectomas that secrete humanized anti- 
bodies with murine V regions and human C regions (787-789). To 
reduce immunogenicity arising from the murine V-region
sequences, the CDRs from a murine antibody of desired specificity
can be grafted onto human V-region framework sequences
(790,791). A completely different approach to obtain human antibodies
using murine hybridoma technology involves engineering
mice to express human antibodies (792). This ambitious goal was
achieved starting with mice whose endogenous Igκ and IgH gene
loci were disrupted by homologous recombination. ES cells from
these mice were then fused with yeast spheroplasts containing YAC
constructs of human DNA. The resulting "xenomice" bear 66 VH
regions, about 80% of the human VH repertoire, and the complete
DH, JH, μ, δ, and γ2 C-region genes, including intronic and 3'α
enhancers. The transferred κ locus contains most of the proximal
part of the Vκ locus (32 Vκ genes), Jκs, and Cκ as well as both the
intronic and 3' enhancers and the κde. The human genes support
grossly normal development of murine B cells, most of which
secrete exclusively human Ig; about 15% of B cells express human
H chain with mouse λ chain. The human gene loci support antigen-specific
antibody responses to immunization, demonstrating isotype
switching and somatic mutation. Xenomice can be used to
generate hybridomas that secrete human Ig but that provide all the
advantages associated with their murine origin. 
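
The CDR grafting idea described in this paragraph can be sketched at the sequence level. The example below is a toy model under strong assumptions: the two V-region sequences are taken to be already aligned and of equal length, and the CDR positions are supplied by the caller as hypothetical index ranges. In practice the CDR boundaries are assigned with an established numbering scheme (such as Kabat numbering), and framework residues that support the CDR loops often need to be adjusted as well.

```python
def graft_cdrs(
    human_framework: str,
    murine_donor: str,
    cdr_ranges: list[tuple[int, int]],
) -> str:
    """Copy the donor's CDR residues into the acceptor framework.

    Both sequences are assumed to be pre-aligned and of equal length.
    Each range is a half-open (start, end) pair of 0-based positions.
    """
    if len(human_framework) != len(murine_donor):
        raise ValueError("sequences must be aligned to the same length")
    grafted = list(human_framework)
    for start, end in cdr_ranges:
        if not 0 <= start < end <= len(grafted):
            raise ValueError(f"invalid CDR range: {(start, end)}")
        # Replace the acceptor residues in this loop with donor residues.
        grafted[start:end] = murine_donor[start:end]
    return "".join(grafted)


if __name__ == "__main__":
    # Placeholder strings, not real V-region sequences.
    human = "HHHHHHHHHH"
    mouse = "mmmmmmmmmm"
    print(graft_cdrs(human, mouse, [(2, 4), (6, 8)]))
```

The result keeps human residues everywhere except the grafted loops, which is the property intended to reduce immunogenicity while preserving the murine antibody's specificity.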

As further variations on Ig structure, bioengineers have designed 
antigen-binding molecules that do not require combining separate 
proteins containing the L-chain and H-chain V regions. One 
approach is to exploit camel Igs, which lack L chains but still can 
bind antigens efficiently (793). A more widely studied strategy 
involves single-chain Fv proteins (794,795); these can be obtained 
from gene constructs encoding hybridoma-derived Vκ and VH
domains connected by a flexible synthetic linker of about 15 amino 
acids that allows these two domains to associate via the same pro- 
tein-protein interactions that hold them together in a normal anti- 
body. The two domains thus form a composite antigen-binding 
structure that often retains the specificity and affinity of its parent 
monoclonal antibody. 
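
A single-chain Fv construct of the kind just described can be sketched as a simple concatenation of protein sequences. The text specifies only a flexible linker of about 15 amino acids; the (Gly4Ser)3 linker used below is a common choice in the field and is an assumption here, as are the function name, the domain order option, and the truncated placeholder sequences.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

# Assumed linker: three repeats of Gly-Gly-Gly-Gly-Ser (15 residues).
GLY_SER_LINKER = "GGGGS" * 3


def build_scfv(
    vh: str,
    vl: str,
    linker: str = GLY_SER_LINKER,
    vh_first: bool = True,
) -> str:
    """Join VH and VL protein sequences through a flexible linker."""
    parts = {"vh": vh.upper(), "vl": vl.upper(), "linker": linker.upper()}
    for name, seq in parts.items():
        if not seq or set(seq) - VALID_RESIDUES:
            raise ValueError(f"{name} is empty or has non-standard residues")
    first, second = ("vh", "vl") if vh_first else ("vl", "vh")
    return parts[first] + parts["linker"] + parts[second]


if __name__ == "__main__":
    # Truncated placeholders standing in for full V-domain sequences.
    print(build_scfv("EVQLVESGG", "DIQMTQSPS"))
```

Either domain order (VH-linker-VL or VL-linker-VH) can be built this way; which one folds and binds better is an empirical question for each antibody.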

The paradigm has facilitated another engineering advance: a 
scheme for generating monoclonal antibodies without hybridoma 
fusions. Libraries of amplified Vκ and VH regions are cloned
together into the same filamentous phage vector, which is designed
to express both V regions as an Fv fusion protein on the outer surface
of the phage. Such phage display libraries can then be selected
for antigen binding by passage over an antigen-containing affinity
column (796) or by successive precipitations with antigen (797).
Phage clones selected for antigen binding contain the Vκ and VH
genes encoding an effective antigen-binding domain. In several
libraries of 10^6 clones, antigen-binding phage could be found only
if the V-region sequences were derived from B cells that were 
obtained after immunization, but for considerably larger libraries, 
prior immunization is not necessary. Once an antigen-binding 
clone is obtained, it can be subjected to random mutagenesis and 
further rounds of selection to obtain higher affinity antigen binding 
(798). Mutagenesis can be achieved by chemical mutagens, by 
error-prone PCR, by shuffling Vκ and VH chains between constructs,
by passage through a mutator strain of bacteria, or by a
strategy of codon-based mutagenesis (799). A dramatically effec- 
tive mutagenesis strategy with great promise for exploring the 
sequence space of antibody structures allows shuffling of muta- 
tions at different positions within a protein sequence to assemble 
various combinations of mutations before selection (800). Once a 
combination of high-affinity Vκ and VH has been selected, the
individual Vκ and VH regions can be subcloned and inserted back
into appropriate expression vectors to generate Ig molecules 
(801,802). It is debatable whether antibody sequences obtained by 
the phage display library strategy are typical of natural antibodies, 
but it is clear that the technology has general utility for producing 
high-affinity monoclonal antibodies of various specificities with- 
out hybridomas. 
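
The cycle of selection, random mutagenesis, and reselection described in this paragraph can be illustrated with a toy simulation. Everything here is a stand-in: clones are short amino acid strings, "affinity" is simply the number of positions matching a hypothetical ideal sequence, and the parameter values are arbitrary. The sketch shows only the logic of iterated selection, not the behavior of any real library.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_affinity(clone: str, ideal: str) -> int:
    """Stand-in for antigen binding: count positions matching the ideal."""
    return sum(a == b for a, b in zip(clone, ideal))


def mutate(clone: str, rng: random.Random, rate: float = 0.1) -> str:
    """Random mutagenesis: each residue is redrawn with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else residue
        for residue in clone
    )


def select(pool: list[str], ideal: str, keep: int) -> list[str]:
    """Selection step: keep the `keep` best-scoring clones."""
    return sorted(pool, key=lambda c: toy_affinity(c, ideal), reverse=True)[:keep]


def affinity_maturation(library, ideal, rounds=5, keep=10, offspring=20, seed=0):
    """Alternate mutagenesis and selection; return best clone and history."""
    rng = random.Random(seed)
    survivors = select(list(library), ideal, keep)
    history = [toy_affinity(survivors[0], ideal)]
    for _ in range(rounds):
        mutants = [mutate(c, rng) for c in survivors for _ in range(offspring)]
        # Parents stay in the pool, so the best score can never decrease.
        survivors = select(survivors + mutants, ideal, keep)
        history.append(toy_affinity(survivors[0], ideal))
    return survivors[0], history


if __name__ == "__main__":
    rng = random.Random(1)
    ideal = "".join(rng.choice(AMINO_ACIDS) for _ in range(12))
    library = ["".join(rng.choice(AMINO_ACIDS) for _ in range(12)) for _ in range(200)]
    best, history = affinity_maturation(library, ideal)
    print(best, history)
```

Keeping the parental clones alongside their mutants in each round mirrors the practical point that selection can only enrich what the library already contains or what mutagenesis adds to it.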

Bioengineering technologies have been used to alter the natural 
sequences of both C and V regions to obtain proteins with particu- 
lar properties. Tinkering with C-region sequences can improve 
function of engineered antibodies. For example, in the CH3 
domain of a γ1 monoclonal, replacement of a serine residue by cysteine
led to dimerization and a dramatic improvement in function
of an antileukemia antibody (803). One particularly interesting use
of V-region engineering is the design of antibodies with catalytic 
activity. Enzymes are thought to catalyze reactions in part by 
reducing the activation energy — that is, stabilizing activated transi- 
tion-state intermediates by strong binding interactions. Several 
groups have shown that, by a similar mechanism, antibodies 
directed against a molecule resembling the transition state of a 
chemical reaction can catalyze that reaction (804,805). Catalysis 
also can be achieved by Fv proteins (806). Site-directed mutations 
of V-region sequences can be used to analyze and enhance the cat- 
alytic activity (807). Indeed, if the antibody catalysis can be engi- 
neered to replace an in vivo loss mutation of an essential enzyme, 
random mutations of the antibody gene can be selected in vivo for 
improved activity (808). 

In another avenue of Ig engineering, antibody domains have 
been added to unrelated peptide sequences in order to confer some 
desired Ig function to a different protein. Most commonly the V 
region is used to direct the unrelated polypeptide to a specific target.
For example, in an attempt to improve the potency and specificity
of tissue plasminogen activator (t-PA, an enzyme useful in
dissolving clots in heart attack victims), an antifibrin antibody was
linked to t-PA sequence to focus the plasminogen activation on fib- 
rin clots (809). Other similar engineering projects have linked 
staphylococcal nuclease and Escherichia coli DNA polymerase 
functions to Ig molecules. A major area of applied research 
explores Ig-toxin hybrids, which offer the potential of delivering 
potent toxins to specific targets, especially cancer cells (810). A 
related strategy is to link a V region of one specificity to an Ig of a 
second specificity, creating bispecific antibodies or diabodies.
Bispecific antibodies have a wide range of uses, including targeting T
cells (via anti-CD3) to cells bearing a particular antigen that can be 
recognized by an antibody rather than by a TCR (811); they also 
can be used in immunoassays (812). Bispecific antibodies can be 
generated by transfecting the two L-chain and two H-chain genes 
into a single producer cell, or by fusing cells producing two anti- 
bodies; in either case the desired bispecific protein must then be 
purified from the resultant mixture of components. Alternatively, 
bispecific antibodies can be genetically engineered as two Fv pro- 
teins joined by a linker chain of amino acids (813). 

Although the above examples represent uses of Ig V regions, Ig 
C regions also have been exploited, often fused to unrelated pro- 
teins that have their own targeting properties; such fusion proteins 
are known as immunoadhesins. In such constructs the Ig constant 
domains can confer multivalency, increased stability, and effector 
functions (e.g., binding to Fc receptors) that can be useful for cer- 
tain applications (814). For example, the receptor for most human 
rhinoviruses is ICAM-1, and a soluble form of this protein might 
act as a decoy receptor to block infection. An ICAM-1 molecule 
fused to Ig H-chain domains was a much more efficient inhibitor 
of infection than ICAM-1 alone (815). 

Antibody gene constructs have been expressed in a variety of 
production systems, including B-lymphoid lines (transfectomas), 
other mammalian lines (e.g., COS cells), and bacteria. More recent 
experiments have involved expression in insect cells (816) and in 
plants (817), where they could theoretically be ingested in an unpu- 
rified state to confer passive mucosal immunotherapy. Ig gene con- 
structs also have been designed so that the antibody is not secreted, 
but instead binds (as an "intrabody") to intracellular targets (818). 

The strategies described above have used the coding sequences 
of Ig genes, but regulatory sequences also have been used in bio- 
engineering projects, primarily to obtain B cell-specific expression 
of foreign gene constructs introduced nonspecifically into multiple 
cell types. Transgenes, for example, are present in every cell type, 
but transgenic constructs linking the μ enhancer to the c-myc gene
have induced malignancies specific to the B-cell lineage, in which 
this enhancer is active (819,820). By similar logic, a retroviral con- 
struct containing an intracellular toxin (like diphtheria toxin) pro- 
grammed for B cell-specific expression might be used to treat B- 
cell malignancies (821). 

CONCLUSION 

Recombinant DNA technology has revolutionized the study of 
the antibody response. Initial investigations used powerful cloning 
and sequencing methods to define the structure of the Ig genes as 
they exist in the germline and in actively secreting B-lymphocytes. 
More recent studies have probed the mechanisms of the processes 
unique to these genes, i.e., rearrangements and somatic mutation. 
These more difficult questions will represent a challenge for a long 
time to come, although the recent experiments yielding VDJ and 
switch recombination reactions in cell-free extracts can be 
expected to yield valuable clues to the mechanisms of these 
processes. Meanwhile the knowledge already gained about Ig 
genes is being applied to many clinical and scientific endeavors 
that hold promise for exciting advances in the near future.